Abstract. We consider the fragments $\mathrm{FO}^2$, $\Sigma_2 \cap \mathrm{FO}^2$, $\Pi_2 \cap \mathrm{FO}^2$, and $\Delta_2$ of first-order
logic $\mathrm{FO}[<]$ over finite and infinite words. For all four fragments, we give characterizations
in terms of rankers. In particular, we generalize the notion of a ranker to
infinite words in two possible ways. Both extensions are natural in the sense that over
finite words, they coincide with classical rankers and over infinite words, they both
have the full expressive power of $\mathrm{FO}^2$. Moreover, the first extension of rankers admits
a characterization of $\Sigma_2 \cap \mathrm{FO}^2$ while the other leads to a characterization of $\Pi_2 \cap \mathrm{FO}^2$.
Both versions of rankers yield characterizations of the fragment $\Delta_2 = \Sigma_2 \cap \Pi_2$. As a
byproduct, we also obtain characterizations based on unambiguous temporal logic and
unambiguous interval temporal logic.

1 Introduction

We consider fragments of two-variable first-order logic $\mathrm{FO}^2$. Formulas are interpreted over words
which may be infinite or finite. Over finite words only, a large number of different characterizations
of $\mathrm{FO}^2$ is known, see e.g. [7] or [1] for an overview. Some of the characterizations have been
generalized to infinite words in [2]. In this paper, we continue this line of work. For this paper
the main difference between finite word models and infinite word models is the following: Over
finite words, $\mathrm{FO}^2$ and the fragment $\Delta_2 = \Sigma_2 \cap \Pi_2$ have the same expressive power [8], whereas
$\Delta_2$ is a strict subclass of $\mathrm{FO}^2$ over infinite words. Moreover, in the case of infinite words, $\mathrm{FO}^2$
is incomparable to $\Sigma_2$ and $\Pi_2$. By definition $\Sigma_2$ is the class of formulas in prenex normal form
with two blocks of quantifiers starting with a block of existential quantifiers, and $\Pi_2$ is the class of
negations of $\Sigma_2$-formulas. Here and throughout the paper, we identify a logical fragment with the
class of languages definable in the fragment.

An important concept in this paper is that of rankers, which were introduced by Immerman and
Weis [9] in order to give a combinatorial characterization of quantifier alternation within $\mathrm{FO}^2$ over
finite words. Casually speaking, a ranker is a sequence of instructions of the form "go to the next
$a$-position" and "go to the previous $a$-position" for some letters $a$. For every word, a ranker is
either undefined or it determines a unique position. We generalize rankers to infinite words in two
possible ways. The main difference to finite words is that we have to define the semantics of "go to
the last $a$-position" if there are infinitely many occurrences of the letter $a$. The first solution is to
say that this modality evaluates to false and that the position is undefined. The second approach is
to stay at an infinite position. For example, if a word has infinitely many $a$-positions but only two
$b$-positions, then in the first semantics "go to the last $a$-position and from there, go to the previous
$b$-position" would be false while in the second semantics it would be true and it would determine the
last $b$-position. By delaying the interpretation of modalities until some letter with finite occurrence
is met, the second semantics is reminiscent of the lazy evaluation principle. We therefore call
rankers under the second semantics lazy rankers. If we want to emphasize that we use the first
semantics, then we use the term eager ranker. The language $L(r)$ of a ranker $r$ consists of all words
on which $r$ is defined. A ranker language is a Boolean combination of languages of the form $L(r)$ for
some rankers $r$.

In both ways, rankers admit natural combinatorial characterizations of the first-order fragments
$\mathrm{FO}^2$ and $\Delta_2$ over finite and infinite words. Moreover, the eager semantics yields a characterization
of $\Sigma_2 \cap \mathrm{FO}^2$ while lazy rankers lead to a characterization of $\Pi_2 \cap \mathrm{FO}^2$. We note that the
decidability results for the first-order fragments lead to decidability results for the respective ranker
fragments [2].

Let $\Gamma^\infty$ be the set of all finite and infinite words over the alphabet $\Gamma$ and let $L \subseteq \Gamma^\infty$. Our main
results are

• $L \in \mathrm{FO}^2$ if and only if $L$ is an eager ranker language (Theorem 1) if and only if $L$ is a lazy
ranker language (Theorem 5).

• $L \in \Sigma_2 \cap \mathrm{FO}^2$ if and only if $L$ is a positive eager ranker language with some additional atomic
modality (Theorem 2).

• $L \in \Pi_2 \cap \mathrm{FO}^2$ if and only if $L$ is a positive lazy ranker language with some additional atomic
modality (Theorem 4).

• $L \in \Delta_2$ if and only if $L$ is a ranker language such that all instructions start with a
modality "go to the first $a$-position" (Theorem 3).

It turns out that unambiguous temporal logic [3] and unambiguous interval temporal logic [4] allow
natural intermediate characterizations on the way from first-order logic to rankers. In particular,
this yields temporal logic counterparts of the first-order fragments. Moreover, we present a way
of converting formulas in unambiguous interval temporal logic into equivalent formulas in
unambiguous temporal logic, which does not introduce new negations (Propositions 1 and 2). This also
leads to a new characterization of $\mathrm{FO}^2$ over finite words in terms of restricted ranker languages
(Corollary 1).

In this paper, all steps from fragments of first-order logic to interval temporal logic are based on
characterizations in terms of so-called unambiguous polynomials; almost all other steps are effective
syntactic transformations. The sole exception is the inclusion of some ranker fragment in $\Pi_2 \cap \mathrm{FO}^2$.
This step relies on a characterization of $\Pi_2 \cap \mathrm{FO}^2$ in terms of the alphabetic topology [2].

An extended abstract of our results will be presented at the 14th International Conference on 
Developments in Language Theory (DLT 2010). 

2 Preliminaries 

In the following $\Gamma$ denotes a finite alphabet. For $A \subseteq \Gamma$ we denote by $A^*$ the set of finite words
over $A$. The set of infinite words is $A^\omega$, and $A^\infty = A^* \cup A^\omega$ is the set of finite and infinite words.
The empty word is $\varepsilon$ and we have $\{\varepsilon\} = \emptyset^\infty$. We denote potentially infinite words by lowercase
Greek letters $\alpha, \beta, \gamma$ whereas finite words are denoted by lowercase Latin letters $u, v, w$; for letters
in $\Gamma$ we use $a, b, c, d$. For a word $\alpha$ and a position $x$ of the word, $\alpha(x)$ is the $x$-th letter of $\alpha$.
By $|\alpha| \in \mathbb{N} \cup \{\infty\}$ we denote the length of $\alpha$. Therefore $\alpha = \alpha(1) \cdots \alpha(|\alpha|)$ if $\alpha$ is finite and
$\alpha = \alpha(1)\alpha(2) \cdots$ if $\alpha$ is infinite. We extend this notation to intervals $I \subseteq \mathbb{N}$ and write $\alpha(I)$ for
the word comprising the positions of $\alpha$ contained in $I$. In particular, we do not require that $I$ is
contained in the set of positions of $\alpha$. Hence, for all $\alpha \in \Gamma^\infty$ we have $\alpha(\emptyset) = \varepsilon$, and $\alpha(\mathbb{N}) = \alpha$ even
if the word $\alpha$ is finite. We call $\mathrm{alph}(\alpha)$ the alphabet of $\alpha$, i.e., the set of letters occurring in $\alpha$. For
$a \in \Gamma$, a position labeled by $a$ is called an $a$-position. By $\mathrm{im}(\alpha)$ we mean the imaginary alphabet of
$\alpha$, i.e., the set of letters occurring infinitely often in $\alpha$. For $A \subseteq \Gamma$, the set of words with imaginary
alphabet $A$ is denoted by $A^{\mathrm{im}}$. In particular, $\Gamma^* = \emptyset^{\mathrm{im}}$.
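The notions of alphabet and imaginary alphabet can be made concrete for ultimately periodic words. The following minimal sketch assumes that a word is given as a pair of Python strings (u, v) standing for $u v^\omega$, with an empty v encoding the finite word u; this encoding and all function names are illustrative choices and not part of the paper.

```python
import math

def alph(u, v=""):
    """Letters occurring in the word u v^omega (the finite word u if v is empty)."""
    return set(u) | set(v)

def im(u, v=""):
    """Imaginary alphabet: the letters occurring infinitely often, i.e. those of the loop v."""
    return set(v)

def length(u, v=""):
    """Length of the word as an element of N united with {infinity}."""
    return math.inf if v else len(u)

def letter(u, v, x):
    """Letter at the finite position x >= 1 (positions are 1-based as in the text)."""
    return u[x - 1] if x <= len(u) else v[(x - len(u) - 1) % len(v)]
```

Under this encoding a word is finite exactly if its imaginary alphabet is empty, which mirrors the identity $\Gamma^* = \emptyset^{\mathrm{im}}$.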

A monomial (of degree $k$) is a language of the form $A_1^* a_1 \cdots A_k^* a_k A_{k+1}^\infty$ for letters $a_i \in \Gamma$ and sets
$A_i \subseteq \Gamma$. It is unambiguous if each word of the monomial has a unique factorization $u_1 a_1 \cdots u_k a_k \beta$
with $u_i \in A_i^*$ and $\beta \in A_{k+1}^\infty$. A polynomial (of degree $k$) is a finite union of monomials (of degree
at most $k$). It is called unambiguous if it is a finite union of unambiguous monomials.

Example 1 The set of all finite words over an alphabet $A \subseteq \Gamma$ is an unambiguous polynomial.
We have

\[
A^* = \emptyset^\infty \cup \bigcup_{a \in A} A^* a \emptyset^\infty,
\]

i.e., a word is of finite length if it is either empty or if there is a last letter.

Example 2 Consider the language $L = (\Gamma \setminus \{b\})^* a \Gamma^\infty \cap (\Gamma \setminus \{c\})^* b \Gamma^\infty \cap \Gamma^* c \Gamma^\infty$ of all words
containing a $c$ such that there is an $a$ with no $b$ to the left, and such that there is a $b$ with no $c$ to
the left. This language is an unambiguous monomial since

\[
L = (\Gamma \setminus \{a, b, c\})^* a (\Gamma \setminus \{b, c\})^* b (\Gamma \setminus \{c\})^* c \Gamma^\infty.
\]

Moreover, $L$ is the set of all words such that the first $a$ occurs before the first $b$ which in turn occurs
before the first $c$.
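The three descriptions of this language (the intersection, a monomial in which every block excludes the letters still to come, and the ordering of first occurrences) can be compared by brute force. The sketch below is only a sanity check under stated assumptions: it fixes the illustrative alphabet {a, b, c, d}, and it considers finite words only, which suffices because every factor ends in $\Gamma^\infty$ and membership is therefore decided by a finite prefix. All names are hypothetical.

```python
import re
from itertools import product

GAMMA = "abcd"  # illustrative alphabet; d plays the role of "any other letter"

# re.match anchors at the start only, so the trailing Gamma^infinity is implicit.
INTERSECTION = [
    re.compile(r"[^b]*a"),  # (Gamma \ {b})* a Gamma^infinity
    re.compile(r"[^c]*b"),  # (Gamma \ {c})* b Gamma^infinity
    re.compile(r".*c"),     # Gamma* c Gamma^infinity
]
MONOMIAL = re.compile(r"[^abc]*a[^bc]*b[^c]*c")

def in_intersection(w):
    return all(p.match(w) is not None for p in INTERSECTION)

def in_monomial(w):
    return MONOMIAL.match(w) is not None

def first_occurrences_ordered(w):
    """First a before first b before first c."""
    return all(x in w for x in "abc") and w.index("a") < w.index("b") < w.index("c")

def agree_up_to(n):
    """Check that the three descriptions agree on all words of length at most n."""
    words = ("".join(t) for k in range(n + 1) for t in product(GAMMA, repeat=k))
    return all(in_intersection(w) == in_monomial(w) == first_occurrences_ordered(w)
               for w in words)
```

Calling agree_up_to(5) compares the three predicates on all 1365 words of length at most five over the chosen alphabet.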

2.1 Fragments of First-Order Logic 

We denote by $\mathrm{FO} = \mathrm{FO}[<]$ the first-order logic over words interpreted as labeled linear orders
(without $\infty$). As atomic formulas, FO comprises $\top$ (for true), the unary predicate $\lambda(x) = a$ for
$a \in \Gamma$, and the binary predicate $x < y$ for variables $x$ and $y$. The idea is that variables range over
the linearly ordered positions of a word and $\lambda(x) = a$ means that $x$ is an $a$-position. Apart from the
Boolean connectives, we allow composition of formulas using existential quantification $\exists x\colon \varphi$ and
universal quantification $\forall x\colon \varphi$ for $\varphi \in \mathrm{FO}$. The semantics is as usual. We introduce the common
shortcut $\bot$ for $\neg\top$. Typical names for formulas in this paper are $\varphi, \psi, \varrho, \vartheta, \mu, \nu, \sigma$.

Every formula in FO can be converted into a semantically equivalent formula in prenex normal
form by renaming variables and moving quantifiers to the front. This observation gives rise to the
fragment $\Sigma_2$ (resp. $\Pi_2$) consisting of all FO-formulas in prenex normal form with only two blocks of
quantifiers, starting with a block of existential quantifiers (resp. universal quantifiers). Note that
the negation of a formula in $\Sigma_2$ is equivalent to a formula in $\Pi_2$ and vice versa. The fragments
$\Sigma_2$ and $\Pi_2$ are both closed under conjunction and disjunction. Furthermore, $\mathrm{FO}^2$ is the fragment
of FO containing all formulas which use at most two different names for the variables. This is a
natural restriction, since FO with three variables already has the full expressive power of FO.

A sentence in FO is a formula without free variables. For a sentence $\varphi$ the language defined by $\varphi$,
denoted by $L(\varphi)$, is the set of all words $\alpha \in \Gamma^\infty$ that model $\varphi$, i.e., $\alpha \models \varphi$. We frequently identify
logical fragments with the classes of languages they define (as in the definition of the fragment
$\Delta_2 = \Sigma_2 \cap \Pi_2$ for example).



Example 3 Consider the formulas

\[
\varphi = \exists x\, \forall y\colon\ y \le x \,\vee\, \lambda(y) \ne a \qquad \text{and} \qquad \psi = \forall x\, \exists y\colon\ y > x \,\wedge\, \lambda(y) = a.
\]

The formula $\varphi \in \Sigma_2 \cap \mathrm{FO}^2$ states that after some position there is no $a$-position, i.e., $L(\varphi)$ contains
all words with finitely many $a$-positions. Its negation $\psi \in \mathrm{FO}^2 \cap \Pi_2$ says that for all positions there
is a greater $a$-position, i.e., $L(\psi)$ is the set of all words $\alpha$ with $a \in \mathrm{im}(\alpha)$. Surprisingly, $L(\varphi)$ is not
definable in $\Pi_2$ and $L(\psi)$ is not definable in $\Sigma_2$, cf. [2].

3 Rankers and Unambiguous Temporal Logics 

For finite words, rankers have been introduced by Immerman and Weis [9]. They can be seen as a
generalization of turtle programs used by Schwentick, Thérien, and Vollmer [6] for characterizing
$\mathrm{FO}^2$-definable languages over finite words. The main difference between rankers and turtle programs
is that rankers either uniquely determine a position in a word or they are undefined, whereas turtle
programs mainly distinguish between being defined and being undefined.

Extending rankers with Boolean connectives yields unambiguous temporal logic (unambiguous 
TL). It is called unambiguous since each position considered by some formula in this logic is unique. 
Unambiguous TL has been introduced for Mazurkiewicz traces [3] which are a generalization of finite 
words. 

All of our characterizations of first-order fragments rely on unambiguous polynomials. A natural 
intermediate step from polynomials to temporal logic is interval temporal logic. Unambiguous 
interval temporal logic (unambiguous ITL) has been introduced by Lodaya, Pandya, and Shah [4]
for finite words. They showed that over finite words it has the same expressive power as $\mathrm{FO}^2$.

In this section, we generalize all three concepts (rankers, unambiguous TL, and unambiguous
ITL) to infinite words. For each concept there are essentially two natural choices for such
generalizations. Surprisingly, it turns out that one extension can be used for the characterization of
the first-order fragment $\Sigma_2 \cap \mathrm{FO}^2$ over $\Gamma^\infty$ while the other yields a characterization of $\Pi_2 \cap \mathrm{FO}^2$.
Moreover, both semantics can be used for describing $\mathrm{FO}^2$ and $\Delta_2$. In fact, for $\Delta_2$ we use some
fragment of rankers which conceals the difference between the two versions.

Basically, all proofs of our main theorems have the following structure: Using some
characterization in terms of unambiguous polynomials, we go from first-order logic to interval temporal logic;
then formulas in interval temporal logic are transformed into equivalent formulas in temporal logic,
which in turn can be easily converted into some ranker descriptions. The last step is to express
ranker languages within some fragment of first-order logic. In all proofs, the main technical step
is the conversion of unambiguous ITL into unambiguous TL without introducing new negations
(Propositions 1 and 2).

3.1 Rankers 

A ranker is a finite word over the alphabet $\{X_a, Y_a \mid a \in \Gamma\}$. It can be interpreted as a sequence
of instructions of the form $X_a$ and $Y_a$. Here, $X_a$ (for neXt-$a$) means "go to the next $a$-position"
and $Y_a$ (for Yesterday-$a$) means "go to the previous $a$-position". Below, we will introduce a second
variant of rankers (lazy rankers). To distinguish the two, we will sometimes use the attribute eager for
this first version of rankers. For a word $\alpha$ and a position $x \in \mathbb{N} \cup \{\infty\}$ we define

\[
X_a(\alpha, x) = \min \{ y \in \mathbb{N} \mid \alpha(y) = a \text{ and } y > x \},
\]
\[
Y_a(\alpha, x) = \max \{ y \in \mathbb{N} \mid \alpha(y) = a \text{ and } y < x \}.
\]

As usual, we set $y < \infty$ for all $y \in \mathbb{N}$. The minimum and the maximum of $\emptyset$ as well as the
maximum of an infinite set are undefined. In particular, $X_a(\alpha, \infty)$ is always undefined and $Y_a(\alpha, \infty)$
is defined if and only if $a \in \mathrm{alph}(\alpha) \setminus \mathrm{im}(\alpha)$. We extend this definition to rankers by setting
$X_a r(\alpha, x) = r(\alpha, X_a(\alpha, x))$ and $Y_a r(\alpha, x) = r(\alpha, Y_a(\alpha, x))$, i.e., rankers are processed from left to
right. We say that $r(\alpha, x)$ is undefined, if after processing some prefix of $r$ on $\alpha$, the resulting
position is undefined. If $r(\alpha, x)$ is defined for some non-empty ranker $r$, then $r(\alpha, x) \ne \infty$.

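Figure 1: Signature of $\alpha = a_1 a_2 a_3 \cdots$ over lazy rankers (the finite positions $1, 2, 3, \ldots$ are labeled $a_1, a_2, a_3, \ldots$ and the position $\infty$ is labeled $\mathrm{im}(\alpha)$)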

Next, we define another variant of rankers as finite words over the alphabet $\{X^\ell_a, Y^\ell_a \mid a \in \Gamma\}$.
The superscript $\ell$ is derived from lazy and accordingly such rankers are called lazy rankers. The
difference to eager rankers is that lazy rankers can point to an infinite position $\infty$. The idea is
that the position $\infty$ is not reachable from any finite position and that it represents the behavior at
infinity. We imagine that $\infty$ is greater than all finite positions and it is labeled by all letters in $\mathrm{im}(\alpha)$
for words $\alpha$. Therefore, it is often adequate to set $\infty < \infty$, since the infinite position simulates a
set of finite positions. For a word $\alpha$ and a finite position $x \in \mathbb{N}$ we define $X^\ell_a(\alpha, x) = X_a(\alpha, x)$ and
$Y^\ell_a(\alpha, x) = Y_a(\alpha, x)$. For the infinite position we set

\[
X^\ell_a(\alpha, \infty) = \begin{cases} \infty & \text{if } a \in \mathrm{im}(\alpha) \\ \text{undefined} & \text{else} \end{cases}
\qquad
Y^\ell_a(\alpha, \infty) = \begin{cases} \infty & \text{if } a \in \mathrm{im}(\alpha) \\ Y_a(\alpha, \infty) & \text{else} \end{cases}
\]

i.e., $Y^\ell_a(\alpha, \infty)$ is undefined if $a \notin \mathrm{alph}(\alpha)$, and $Y^\ell_a(\alpha, \infty) = Y_a(\alpha, \infty)$ is a finite position if
$a \in \mathrm{alph}(\alpha) \setminus \mathrm{im}(\alpha)$. As before, we extend this definition to rankers of length $> 1$ by setting
$X^\ell_a r(\alpha, x) = r(\alpha, X^\ell_a(\alpha, x))$ and $Y^\ell_a r(\alpha, x) = r(\alpha, Y^\ell_a(\alpha, x))$. We denote by

\[
\mathrm{alph}_\Gamma(r) = \{ a \in \Gamma \mid r \in q \{X_a, Y_a, X^\ell_a, Y^\ell_a\} s \text{ for some rankers } q, s \}
\]

the set of letters in $\Gamma$ occurring in some modality of the ranker $r$. It can happen that $r(\alpha, \infty) = \infty$
for some non-empty lazy ranker $r$. This is the case if and only if $r$ is of the form $Y^\ell_a s$ and
$\mathrm{alph}_\Gamma(r) \subseteq \mathrm{im}(\alpha)$.

If the reference to the word $\alpha$ is clear from the context, then for eager and lazy rankers $r$ we
shorten the notation and write $r(x)$ instead of $r(\alpha, x)$.

An eager ranker $r$ is an X-ranker if $r = X_a s$ for some ranker $s$ and $a \in \Gamma$, and it is a Y-ranker
if $r$ is of the form $Y_a s$. Lazy $X^\ell$-rankers and $Y^\ell$-rankers are defined similarly. We proceed to define
$r(\alpha)$, the position of $\alpha$ reached by the ranker $r$ by starting "outside" the word $\alpha$. The position
$r(\alpha)$ is called $r$-position. The intuition is as follows. If $r$ is an X-ranker or an $X^\ell$-ranker, we imagine
that we start at an outside position in front of $\alpha$; if $r$ is a Y-ranker or a $Y^\ell$-ranker, then we start
at a position behind $\alpha$. Therefore, we define

\[
r(\alpha) = r(\alpha, 0) \quad \text{if } r \text{ is an X-ranker or an } X^\ell\text{-ranker},
\]
\[
r(\alpha) = r(\alpha, \infty) \quad \text{if } r \text{ is a Y-ranker or a } Y^\ell\text{-ranker}.
\]

On the left side of Figure 2, a possible situation for the eager ranker $Y_a Y_b X_c$ being defined on
some word a is depicted. The right side of the same figure illustrates a similar situation for the 
lazy ranker Y^X^Y^ Y^X^ with d € im(a) and a € alph(a) \ im(a). Note that the eager version of 
the same ranker would not be defined on $\alpha$ since $d \in \mathrm{im}(\alpha)$.
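To experiment with the two semantics, the definitions can be evaluated directly on ultimately periodic words. The following is a minimal sketch under stated assumptions: a word is encoded as a pair of strings (u, v) standing for $u v^\omega$ (an empty v encodes the finite word u), a ranker is a non-empty list of pairs such as ("Y", "a"), and the flag lazy selects the lazy semantics for all modalities. The names step and run and this encoding are illustrative only.

```python
import math

INF = math.inf  # the infinite position

def letter(u, v, x):
    """Letter at the finite position x >= 1 of u v^omega."""
    return u[x - 1] if x <= len(u) else v[(x - len(u) - 1) % len(v)]

def step(u, v, kind, a, x, lazy):
    """Apply X_a (kind == "X") or Y_a (kind == "Y") at position x; None means undefined."""
    in_im = a in v                      # a occurs infinitely often
    if x == INF:
        if lazy and in_im:
            return INF                  # lazy modalities stay at the infinite position
        if kind == "X" or in_im:
            return None                 # nothing lies beyond infinity; no last a-position
        x = len(u) + 1                  # finitely many a's: all of them lie in the prefix u
    if kind == "X":
        # Beyond the prefix, scanning one full period of v is enough.
        last = max(x, len(u)) + len(v) if v else len(u)
        candidates = range(x + 1, last + 1)
    else:
        candidates = range(x - 1, 0, -1)
    return next((y for y in candidates if letter(u, v, y) == a), None)

def run(u, v, ranker, lazy=False):
    """Position r(alpha): start at 0 for an X-ranker and at infinity for a Y-ranker."""
    x = 0 if ranker[0][0] == "X" else INF
    for kind, a in ranker:
        x = step(u, v, kind, a, x, lazy)
        if x is None:
            return None
    return x
```

For the word $bab\,a^\omega$, which has infinitely many $a$-positions but only two $b$-positions, run("bab", "a", [("Y", "a"), ("Y", "b")], lazy=True) yields the last $b$-position 3, whereas the eager evaluation of the same instructions is undefined (None).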

Given these definitions, an eager ranker can either be undefined on a word $\alpha$ (if at some stage of
the evaluation of $r(\alpha)$ an instruction cannot be accomplished) or the ranker is defined on $\alpha$ and in
this case it determines a unique finite position of $\alpha$. If a lazy ranker $r$ is defined on $\alpha$, then either
$r(\alpha) = \infty$ or it defines a unique finite position of $\alpha$. For the empty ranker $\varepsilon$ we have $\varepsilon X_a = X_a$ and
$\varepsilon X^\ell_a = X^\ell_a$ as well as $\varepsilon Y_a = Y_a$ and $\varepsilon Y^\ell_a = Y^\ell_a$, i.e., the empty ranker $\varepsilon$ either starts at position $0$ or
$\infty$ depending on whether the next modality is in $\{X_a, X^\ell_a \mid a \in \Gamma\}$ or in $\{Y_a, Y^\ell_a \mid a \in \Gamma\}$. Moreover,
$\varepsilon$ is defined on every word even though it does not determine a unique position of the word. The
empty ranker is both eager and lazy.

Figure 2: Eager and lazy rankers

For an eager or lazy ranker $r$ the language $L(r)$ generated by $r$ is the set of all words in $\Gamma^\infty$
on which $r$ is defined. A (positive) ranker language is a finite (positive) Boolean combination of
languages of the form $L(r)$ for a ranker $r$. A (positive) lazy ranker language is a finite (positive)
Boolean combination of languages of the form $L(r)$ for a lazy ranker $r$. Finally, a (positive) X-ranker
language is a (positive) ranker language using only X-rankers.

Example 4 The language $L = (\Gamma \setminus \{a, b, c\})^* a (\Gamma \setminus \{b, c\})^* b (\Gamma \setminus \{c\})^* c \Gamma^\infty$ of all words such that
the first $a$ occurs before the first $b$ which in turn occurs before the first $c$ is a positive X-ranker
language because

\[
L = L(X_b Y_a) \cap L(X_c Y_b).
\]

We have $\mathrm{alph}_\Gamma(X_b Y_a) = \{a, b\}$ and $\mathrm{alph}_\Gamma(X_c Y_b) = \{b, c\}$, and both rankers are X-rankers.

Example 5 Consider the language $L \subseteq \Gamma^\infty$ consisting of all non-empty words with $a$ as the first
letter. A word is contained in $L$ if and only if it contains an $a$-position such that no $b \in \Gamma$
occurs to the left of the first $a$-position. Therefore,

\[
L = L(X_a) \cap \Gamma^\infty \setminus \bigcup_{b \in \Gamma} L(X_a Y_b),
\]

that is, $L$ is a Boolean combination of the rankers $X_a$ and $X_a Y_b$ for $b \in \Gamma$.

3.2 Unambiguous Temporal Logic 

Our generalization of rankers allows us to define unambiguous temporal logic (unambiguous TL)
over infinite words. As for rankers, we have an eager and a lazy variant. The syntax is given by:

\[
\top \mid \neg\varphi \mid \varphi \vee \psi \mid \varphi \wedge \psi \mid X_a \varphi \mid Y_a \varphi \mid G_a \mid H_a \mid X^\ell_a \varphi \mid Y^\ell_a \varphi \mid G^\ell_a \mid H^\ell_a
\]

for $a \in \Gamma$ and formulas $\varphi, \psi$ in unambiguous TL. The atomic formulas are $\top$ (which is true),
and the eager modalities $G_a$ (for Globally-no-$a$) and $H_a$ (for Historically-no-$a$), as well as the lazy
modalities $G^\ell_a$ (for lazy-Globally-no-$a$) and $H^\ell_a$ (for lazy-Historically-no-$a$). We now define when
a word $\alpha$ at a position $x \in \mathbb{N} \cup \{\infty\}$ satisfies a formula $\varphi$ in unambiguous TL, in which case we
write $\alpha, x \models \varphi$. The atomic formula $\top$ is true at all positions and the semantics of the Boolean
connectives is as usual. For $Z \in \{X_a, Y_a, X^\ell_a, Y^\ell_a \mid a \in \Gamma\}$ the semantics is defined as follows:

\[
\alpha, x \models Z\varphi \quad \text{iff} \quad Z(x) \text{ is defined and } \alpha, Z(x) \models \varphi,
\]

and the semantics of the atomic modalities is given by

\[
G_a = \neg X_a \top, \qquad H_a = \neg Y_a \top, \qquad G^\ell_a = \neg X^\ell_a \top, \qquad H^\ell_a = \neg Y^\ell_a \top.
\]

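In order to define when a word $\alpha$ models a formula $\varphi$, we have to distinguish whether $\varphi$ starts with
a future or a past modality:

\[
\begin{array}{lll@{\qquad}lll}
\alpha \models X_a \varphi & \text{iff} & \alpha, 0 \models X_a \varphi, & \alpha \models Y_a \varphi & \text{iff} & \alpha, \infty \models Y_a \varphi, \\
\alpha \models G_a & \text{iff} & \alpha, 0 \models G_a, & \alpha \models H_a & \text{iff} & \alpha, \infty \models H_a, \\
\alpha \models X^\ell_a \varphi & \text{iff} & \alpha, 0 \models X^\ell_a \varphi, & \alpha \models Y^\ell_a \varphi & \text{iff} & \alpha, \infty \models Y^\ell_a \varphi, \\
\alpha \models G^\ell_a & \text{iff} & \alpha, 0 \models G^\ell_a, & \alpha \models H^\ell_a & \text{iff} & \alpha, \infty \models H^\ell_a.
\end{array}
\]

The modalities on the left are called future modalities and the modalities on the right are called past
modalities. The atomic modalities $G_a$ and $G^\ell_a$ differ only for the infinite position, but the semantics
of $H_a$ and $H^\ell_a$ differs a lot: $\alpha \models H_a$ if and only if $a \in \mathrm{im}(\alpha)$ or $a \notin \mathrm{alph}(\alpha)$ whereas $\alpha \models H^\ell_a$ if and
only if $a \notin \mathrm{alph}(\alpha)$. Every formula $\varphi$ defines a language $L(\varphi) = \{\alpha \in \Gamma^\infty \mid \alpha \models \varphi\}$.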

For $\mathcal{C} \subseteq \{X_a, Y_a, G_a, H_a, X^\ell_a, Y^\ell_a, G^\ell_a, H^\ell_a\}$ we define the following fragments of unambiguous TL:

• $\mathrm{TL}[\mathcal{C}]$ consists of all formulas using only $\top$, Boolean connectives, and temporal modalities in
$\mathcal{C}$,

• $\mathrm{TL}^+[\mathcal{C}]$ consists of all formulas using only $\top$, positive Boolean connectives (i.e., no negation),
and temporal modalities in $\mathcal{C}$,

• $\mathrm{TL}_X[\mathcal{C}]$ consists of all formulas using only $\top$, Boolean connectives, and temporal modalities
in $\mathcal{C}$ such that all outmost modalities are future modalities,

• $\mathrm{TL}^+_X[\mathcal{C}]$ consists of all formulas in $\mathrm{TL}^+[\mathcal{C}] \cap \mathrm{TL}_X[\mathcal{C}]$.

Frequently, we identify a class of formulas $\mathcal{F}$ with the class of languages $\{L(\varphi) \mid \varphi \in \mathcal{F}\}$. We say
that a language $L \subseteq \Gamma^\infty$ is definable in a logical fragment $\mathcal{F}$ or simply $\mathcal{F}$-definable, if $L = L(\varphi)$ for
some $\varphi \in \mathcal{F}$.

Example 6 Consider again the language $L \subseteq \Gamma^\infty$ of Example 5 consisting of all non-empty words
with $a$ as the first letter. This language is defined by each of the following formulas:

\[
\varphi_1 = X_a \top \wedge \bigwedge_{b \in \Gamma} \neg X_a Y_b \top \ \in \mathrm{TL}_X[X_a, Y_a],
\]
\[
\varphi_2 = \bigwedge_{b \in \Gamma} X_a H_b \ \in \mathrm{TL}_X[X_a, H_a],
\]
\[
\varphi_3 = X_a \top \wedge \bigwedge_{b \in \Gamma \setminus \{a\}} (G_b \vee X_b Y_a \top) \ \in \mathrm{TL}^+[X_a, Y_a, G_a].
\]

The formula $\varphi_1$ says that there is some $a$-position and that no letter occurs before the first
$a$-position. In particular, it uses a negation. The second formula $\varphi_2$ is almost identical, but it uses
the atomic modality $H_b$. Due to the use of this implicit negation in a past modality, no explicit
negation is required. The surprising fact about $\varphi_3$ is that it uses neither negations nor the implicitly
negated past modality $H_b$. It essentially says that before every non-$a$-position there is an $a$-position.
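The three formulas of this example can be compared mechanically on finite words. The sketch below is an illustration under stated assumptions, not the paper's construction: it covers only the eager modalities over finite words (where the position behind the word can be represented by len(w) + 1), it encodes formulas as nested tuples such as ("X", "a", phi) and ("H", "b"), and it fixes the sample alphabet {a, b, c}. All names are hypothetical.

```python
from itertools import product

GAMMA = "abc"   # sample alphabet
TOP = ("T",)

def holds(phi, w, x=None):
    """Does the finite word w satisfy phi at position x (x=None: outside the word)?"""
    op = phi[0]
    if op == "T":
        return True
    if op == "not":
        return not holds(phi[1], w, x)
    if op == "and":
        return holds(phi[1], w, x) and holds(phi[2], w, x)
    if op == "or":
        return holds(phi[1], w, x) or holds(phi[2], w, x)
    a = phi[1]
    if op in ("X", "G"):                      # future modalities start in front of w
        start = 0 if x is None else x
        hits = [y for y in range(start + 1, len(w) + 1) if w[y - 1] == a]
        target = hits[0] if hits else None    # next a-position
    else:                                     # "Y" and "H" start behind w
        start = len(w) + 1 if x is None else x
        hits = [y for y in range(1, start) if w[y - 1] == a]
        target = hits[-1] if hits else None   # previous a-position
    if op in ("G", "H"):
        return target is None                 # G_a = not X_a T, H_a = not Y_a T
    return target is not None and holds(phi[2], w, target)

def conj(parts):
    """Conjunction of a non-empty list of formulas."""
    phi = parts[0]
    for p in parts[1:]:
        phi = ("and", phi, p)
    return phi

phi1 = conj([("X", "a", TOP)] + [("not", ("X", "a", ("Y", b, TOP))) for b in GAMMA])
phi2 = conj([("X", "a", ("H", b)) for b in GAMMA])
phi3 = conj([("X", "a", TOP)] +
            [("or", ("G", b), ("X", b, ("Y", "a", TOP))) for b in GAMMA if b != "a"])

def all_agree(max_len):
    """Each formula holds exactly on the non-empty words starting with a."""
    words = ("".join(t) for k in range(max_len + 1) for t in product(GAMMA, repeat=k))
    return all(holds(phi, w) == w.startswith("a")
               for w in words for phi in (phi1, phi2, phi3))
```

The check all_agree(5) runs over all words of length at most five; it says nothing about infinite words, for which the lazy modalities and the position $\infty$ would have to be modeled as well.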



Example 7 The language $L = (\Gamma \setminus \{b\})^* a \Gamma^\infty$ with $a \ne b$ consisting of all words containing an
$a$-position with no $b$ to the left is defined by each of the following formulas:

\[
\varphi_1 = X_a \neg Y_b \top \ \in \mathrm{TL}[X_a, Y_a],
\]
\[
\varphi_2 = G_b \vee X_b Y_a \top \ \in \mathrm{TL}^+[X_a, Y_a, G_a].
\]

The first formula requires that the first $a$-position has no $b$-position in the past, whereas the
second formula states that there is either no $b$ at all or that there is an $a$-position before the first
$b$-position. Note that for a word in $L$, the position reached by the term $X_b Y_a$ in $\varphi_2$ is not necessarily
the first $a$-position of the word. In particular, formulas can be equivalent without visiting the same
positions. Also note that the argumentation would not be valid for $a = b$.



Inspired by the atomic logical modalities, we extend the notion of a ranker by allowing the atomic
modalities $G_a$ and $H_a$ as well as $G^\ell_a$ and $H^\ell_a$. We call $r$ a ranker with atomic modality $G_a$ ($H_a$, $G^\ell_a$,
$H^\ell_a$, resp.) if $r = s G_a$ ($r = s H_a$, $r = s G^\ell_a$, $r = s H^\ell_a$, resp.) for some ranker $s$. In this setting,
$r = G_a$ is an X-ranker, and $r = H_a$ is a Y-ranker. Analogously, we can add atomic modalities to
lazy rankers. Note that any ranker with some atomic modality is also a formula in unambiguous
TL. We can therefore define the domain of an extended ranker $r$ with some atomic modality by

\[
r(\alpha, x) \text{ is defined} \quad \text{iff} \quad \alpha, x \models r.
\]

If $r \in s \{G_a, H_a, G^\ell_a, H^\ell_a \mid a \in \Gamma\}$ is an extended ranker and $r(\alpha, x)$ is defined, then we set $r(\alpha, x) =
s(\alpha, x)$, i.e., $r(\alpha, x)$ is the position reached after the execution of $s$. The reinterpretation of rankers
as formulas also makes sense for a ranker $r \in \{X_a, Y_a, X^\ell_a, Y^\ell_a\}^*$ without atomic modality, if we
identify $r$ with $r\top$ in unambiguous TL, which is justified since $r$ is defined on $\alpha$ if
and only if $\alpha \models r\top$.

Let $\mathcal{C} \subseteq \{G_a, H_a, G^\ell_a, H^\ell_a\}$. A language $L$ is a ranker language with atomic modalities $\mathcal{C}$ if $L$ is a
Boolean combination of languages $L(r)$ such that $r$ is either a ranker without atomic modalities
or a ranker with some atomic modality in $\mathcal{C}$. Similarly, the notions of lazy / positive / X-ranker
languages are adapted to the use of atomic modalities.

The following lemma shows that not only can we interpret rankers as formulas, but we can also
transform fragments of unambiguous TL into ranker languages.

Lemma 1 For $L \subseteq \Gamma^\infty$ the following holds:

1. If $L \in \mathrm{TL}[X_a, Y_a]$, then $L$ is a ranker language.

2. If $L \in \mathrm{TL}^+[X_a, Y_a, G_a, H_a]$, then $L$ is a positive ranker language with atomic modalities $G_a$
and $H_a$.

3. If $L \in \mathrm{TL}^+[X_a, Y_a, G_a]$, then $L$ is a positive ranker language with atomic modality $G_a$.

4. If $L \in \mathrm{TL}^+_X[X_a, Y_a, G_a]$, then $L$ is a positive X-ranker language with atomic modality $G_a$.

5. If $L \in \mathrm{TL}_X[X_a, Y_a]$, then $L$ is an X-ranker language.

6. If $L \in \mathrm{TL}^+[X^\ell_a, Y^\ell_a, H^\ell_a]$, then $L$ is a positive lazy ranker language with atomic modality $H^\ell_a$.



Proof: We observe the following basic equivalences (with $Z_a \in \{X_a, Y_a, X^\ell_a, Y^\ell_a\}$ and with $\equiv$ denoting
equivalence of formulas on all words and all positions in $\mathbb{N} \cup \{\infty\}$):

\[
Z_a \neg\varphi \equiv Z_a \top \wedge \neg Z_a \varphi,
\]
\[
Z_a (\varphi \vee \psi) \equiv Z_a \varphi \vee Z_a \psi,
\]
\[
Z_a (\varphi \wedge \psi) \equiv Z_a \varphi \wedge Z_a \psi.
\]

For a formula in $\mathrm{TL}[X_a, Y_a]$ we use the equivalences to move all Boolean connectives to the outermost
level, ending up in a Boolean combination of formulas of type $r\top$ for some ranker $r$. This shows 1.
For formulas in $\mathrm{TL}^+[X_a, Y_a, G_a, H_a]$ the same argument yields a positive Boolean combination of
languages defined by rankers with atomic modalities $G_a$ and $H_a$. (Of course, we do not apply the
rule for negations.) Moreover, there is a ranker generated this way containing the atomic modality
$H_a$ if and only if the original formula uses $H_a$. This shows 2 and 3. The situation for formulas in
$\mathrm{TL}^+[X^\ell_a, Y^\ell_a, H^\ell_a]$ is similar, showing 6. If the first non-Boolean modality on each path of the syntax
tree of the original formula is a future modality, then all rankers generated by the above rules start
with future modalities, so 4 and 5 follow. □
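The rewriting used in this proof can be sketched as a small recursive procedure. The sketch assumes that formulas are nested tuples built from ("T",), ("not", phi), ("and", phi, psi), ("or", phi, psi) and modalities ("X", a, phi) or ("Y", a, phi); for negation it uses the equivalence $Z_a \neg\varphi \equiv Z_a \top \wedge \neg Z_a \varphi$, which holds because $Z_a$ determines at most one position. The function name push and the leaf encoding ("ranker", prefix) are illustrative choices.

```python
TOP = ("T",)

def push(prefix, phi):
    """Rewrite 'prefix phi' as a Boolean combination of rankers.

    prefix is a tuple of modalities such as (("X", "a"), ("Y", "b")).
    A leaf ("ranker", prefix) stands for the formula 'prefix T'; the empty
    prefix is the empty ranker, which is defined on every word.
    """
    op = phi[0]
    if op == "T":
        return ("ranker", prefix)
    if op in ("and", "or"):
        # Modalities distribute over conjunction and disjunction.
        return (op, push(prefix, phi[1]), push(prefix, phi[2]))
    if op == "not":
        inner = ("not", push(prefix, phi[1]))
        # Z (not phi) is equivalent to (Z T) and not (Z phi).
        return ("and", ("ranker", prefix), inner) if prefix else inner
    # op is "X" or "Y": the modality extends the ranker prefix.
    return push(prefix + ((op, phi[1]),), phi[2])
```

For instance, push((), ("X", "a", ("not", ("Y", "b", TOP)))) returns the conjunction of the ranker $X_a$ with the negation of the ranker $X_a Y_b$. On negation-free input the result contains no negation, in line with the remark that the rule for negations is not applied to positive formulas.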

Lemma 2 For every non-empty ranker $r$ there exist formulas $\varrho_r, \vartheta_r \in \mathrm{TL}^+[X_a, Y_a, G_a]$ such that
for every $\alpha \in \Gamma^\infty$ with $r(\alpha)$ being defined we have

\[
\alpha, x \models \varrho_r \quad \text{iff} \quad x > r(\alpha),
\]
\[
\alpha, x \models \vartheta_r \quad \text{iff} \quad x \ge r(\alpha).
\]

Proof: We use induction on the length of the ranker. Let $\alpha$ be a word such that $r$ is defined on
$\alpha$. Consider the case $r = s X_a$ for some ranker $s$. Note that there must be an $a$-position since $r$
is defined on $\alpha$. For a position $x$ of $\alpha$ we have $x > r(\alpha)$ if and only if we find an $a$-position $y$
strictly smaller than $x$ such that $y > s(\alpha)$. But this is equivalent to $Y_a(x) > s(\alpha)$ since $Y_a(x)$ is the
maximal $a$-position strictly smaller than $x$. For the formula $\vartheta_r$ we have to be more careful: For a
position $x$ of $\alpha$ we have $x \ge r(\alpha)$ if and only if there is no $a$-position strictly greater than $x$ or if
all such $a$-positions $y$ satisfy $y > r(\alpha)$ which we already know how to express. Therefore,

\[
\varrho_r = Y_a \varrho_s,
\]
\[
\vartheta_r = G_a \vee X_a \varrho_r
\]

for $r = s X_a$ with $\varrho_s = \top$ for $s = \varepsilon$. So, if the ranker starts with $X_a$, we view the empty ranker to
be on a position in front of the word and hence all positions are strictly greater than it.

Consider $r = s Y_a$ for some ranker $s$. Again, we know that there is an $a$-position in $\alpha$. For a
position $x$ of $\alpha$ we have $x \ge r(\alpha)$ if and only if there is no $a$-position strictly greater than $x$ or for
all such $a$-positions $y$ we have $y \ge s(\alpha)$. But this is equivalent to $X_a(x) \ge s(\alpha)$ since $X_a(x)$ is the
minimal $a$-position strictly greater than $x$. A position $x$ of $\alpha$ satisfies $x > r(\alpha)$ if and only if there
is an $a$-position $y$ strictly smaller than $x$ such that $y \ge r(\alpha)$. Hence

\[
\vartheta_r = G_a \vee X_a \vartheta_s,
\]
\[
\varrho_r = Y_a \vartheta_r
\]

for $r = s Y_a$ with $\vartheta_s = \bot$ for $s = \varepsilon$. This means that in the case that the ranker starts with $Y_a$,
we view the empty ranker to be on a position behind the word and hence all positions are strictly
smaller than it. So for $s = \varepsilon$ the formula $\vartheta_r$ is equivalent to $G_a$. □
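The inductive construction can be checked by brute force on finite words. The sketch below assumes the reading of the lemma in which $\varrho_r$ expresses $x > r(\alpha)$ and $\vartheta_r$ expresses $x \ge r(\alpha)$, with the recursion $\varrho_r = Y_a \varrho_s$, $\vartheta_r = G_a \vee X_a \varrho_r$ for $r = s X_a$ and $\vartheta_r = G_a \vee X_a \vartheta_s$, $\varrho_r = Y_a \vartheta_r$ for $r = s Y_a$. It treats finite words only, represents the position behind a word w by len(w) + 1, and writes ("F",) for $\bot$; all names are illustrative.

```python
from itertools import product

def nxt(w, a, x):
    """X_a(w, x): the next a-position after x, or None."""
    return next((y for y in range(x + 1, len(w) + 1) if w[y - 1] == a), None)

def prv(w, a, x):
    """Y_a(w, x): the previous a-position before x, or None."""
    return next((y for y in range(x - 1, 0, -1) if w[y - 1] == a), None)

def run(w, r):
    """Position r(w) of a non-empty eager ranker r on the finite word w, or None."""
    x = 0 if r[0][0] == "X" else len(w) + 1
    for kind, a in r:
        x = nxt(w, a, x) if kind == "X" else prv(w, a, x)
        if x is None:
            return None
    return x

def rho_theta(r):
    """Formulas (rho_r, theta_r) built along the ranker r."""
    rho, theta = ("T",), ("F",)          # conventions for the empty ranker
    for kind, a in r:
        if kind == "X":
            rho = ("Y", a, rho)
            theta = ("or", ("G", a), ("X", a, rho))
        else:
            theta = ("or", ("G", a), ("X", a, theta))
            rho = ("Y", a, theta)
    return rho, theta

def holds(phi, w, x):
    op = phi[0]
    if op in ("T", "F"):
        return op == "T"
    if op == "or":
        return holds(phi[1], w, x) or holds(phi[2], w, x)
    if op == "G":
        return nxt(w, phi[1], x) is None
    y = nxt(w, phi[1], x) if op == "X" else prv(w, phi[1], x)
    return y is not None and holds(phi[2], w, y)

def lemma2_holds(alphabet="ab", max_len=4, max_rank=3):
    """Compare the formulas with the comparisons x > r(w) and x >= r(w)."""
    mods = [(k, a) for k in "XY" for a in alphabet]
    words = ["".join(t) for k in range(max_len + 1) for t in product(alphabet, repeat=k)]
    for n in range(1, max_rank + 1):
        for r in product(mods, repeat=n):
            rho, theta = rho_theta(r)
            for w in words:
                pos = run(w, r)
                if pos is None:
                    continue
                for x in range(1, len(w) + 1):
                    if holds(rho, w, x) != (x > pos) or holds(theta, w, x) != (x >= pos):
                        return False
    return True
```

Such a check covers only small finite instances and is no substitute for the induction above, but it is a quick way to catch a misplaced strict or non-strict inequality.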



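In the following lemma, we set $\infty < \infty$ and also $\infty \le \infty$. This is natural since with the single
"position" $\infty$ we want to model the behavior of a word "after" all finite positions; in particular, if
$\mathrm{im}(\alpha) \ne \emptyset$, then $\infty$ corresponds to infinitely many positions.

Lemma 3 For every non-empty lazy ranker $r$ there exist formulas $\varrho_r, \vartheta_r \in \mathrm{TL}^+[X^\ell_a, Y^\ell_a, H^\ell_a]$ such
that for every $\alpha \in \Gamma^\infty$ with $r(\alpha)$ being defined we have

\[
\alpha, x \models \varrho_r \quad \text{iff} \quad x < r(\alpha),
\]
\[
\alpha, x \models \vartheta_r \quad \text{iff} \quad x \le r(\alpha).
\]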



Proof: For r = X^ we set 



For r = Y^ we set 



Suppose r = sX^^. Then 






0r = K T, 






Suppose $r = s Y^\ell_a$. Then

\[
\varrho_r = X^\ell_a \varrho_s,
\]
\[
\vartheta_r = H^\ell_a \vee Y^\ell_a \varrho_r.
\]

Note that the formulas conform to $\infty < \infty$ and $\infty \le \infty$. □

3.3 Unambiguous Interval Temporal Logic 

Here, we extend unambiguous interval temporal logic (unambiguous ITL) to infinite words in such
a way that it coincides with $\mathrm{FO}^2$. In fact, we have two extensions with this property, one being
eager and one being lazy. The syntax of unambiguous ITL is given by Boolean combinations and:

\[
\top \mid \varphi F_a \psi \mid \varphi L_a \psi \mid G_a \mid H_a \mid \varphi F^\ell_a \psi \mid \varphi L^\ell_a \psi \mid G^\ell_a \mid H^\ell_a
\]

with $a \in \Gamma$ and formulas $\varphi, \psi$ in unambiguous ITL. The name $F_a$ derives from First-$a$ and
$L_a$ derives from Last-$a$. As in unambiguous temporal logic, the atomic formulas are $\top$, the eager
modalities $G_a$ and $H_a$, and the lazy modalities $G^\ell_a$ and $H^\ell_a$. We now define when a word $\alpha$ together
with an interval $(x; y) = \{z \in \mathbb{N} \cup \{\infty\} \mid x < z < y\}$ satisfies a formula $\varphi$ in unambiguous ITL, in
which case we write $\alpha, (x; y) \models \varphi$. Remember that we have set $\infty < \infty$. In particular $(\infty; \infty) =
\{\infty\}$. The atomic formula $\top$ is true for all intervals and the semantics of the Boolean connectives
is as usual. The semantics of the binary modalities is as follows:



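\[
\begin{array}{lll}
\alpha, (x; y) \models \varphi F_a \psi & \text{iff} & X_a(x) \text{ is defined, } X_a(x) < y, \\
 & & \alpha, (x; X_a(x)) \models \varphi \text{ and } \alpha, (X_a(x); y) \models \psi, \\
\alpha, (x; y) \models \varphi L_a \psi & \text{iff} & Y_a(y) \text{ is defined, } Y_a(y) > x, \\
 & & \alpha, (x; Y_a(y)) \models \varphi \text{ and } \alpha, (Y_a(y); y) \models \psi, \\
\alpha, (x; y) \models \varphi F^\ell_a \psi & \text{iff} & X^\ell_a(x) \text{ is defined, } X^\ell_a(x) < y, \\
 & & \alpha, (x; X^\ell_a(x)) \models \varphi \text{ and } \alpha, (X^\ell_a(x); y) \models \psi, \\
\alpha, (x; y) \models \varphi L^\ell_a \psi & \text{iff} & Y^\ell_a(y) \text{ is defined, } Y^\ell_a(y) > x, \\
 & & \alpha, (x; Y^\ell_a(y)) \models \varphi \text{ and } \alpha, (Y^\ell_a(y); y) \models \psi.
\end{array}
\]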

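The semantics of the atomic modalities is given by

\[
G_a = \neg(\top F_a \top), \qquad H_a = \neg(\top L_a \top),
\]
\[
G^\ell_a = \neg(\top F^\ell_a \top), \qquad H^\ell_a = \neg(\top L^\ell_a \top) \vee \bigvee_{b \in \Gamma} \big( (\top L^\ell_b \top) F^\ell_b \top \big).
\]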

In the definition of $H^\ell_a$, the disjunction on the right-hand side ensures that $\alpha, (\infty; \infty) \models H^\ell_a$ for
every infinite word $\alpha \in \Gamma^\omega$ and every $a \in \Gamma$. It will turn out that the inability of specifying the
letters not in $\mathrm{im}(\alpha)$ is crucial in the characterization of the fragment $\Pi_2 \cap \mathrm{FO}^2$. Observe that only
for the interval $(\infty; \infty)$, there can be a $b$ before the "first" $b$. Also note that for every finite interval
of some word $\alpha$, the formula $G_a$ is true if and only if $H_a$ is true. Whether a word $\alpha$ models a
formula $\varphi$ in unambiguous ITL (i.e., $\alpha \models \varphi$) or not is defined by

\[
\alpha \models \varphi \quad \text{iff} \quad \alpha, (0; \infty) \models \varphi,
\]

and the language defined by $\varphi$ is $L(\varphi) = \{\alpha \in \Gamma^\infty \mid \alpha \models \varphi\}$.

Figure 3 depicts the situation for the formula $(\varphi_1 F_b \psi_1) L_a (\varphi_2 F_c \psi_2)$ being defined on $\alpha$.
The main difference to rankers and unambiguous TL is that there is no crossing
over in unambiguous ITL, e.g., in the situation depicted
on the left side of Figure 2, the formula $(\top L_b (\top F_c \top)) L_a \top$
is false even though $Y_a Y_b X_c$ is defined.
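The absence of crossing over can be illustrated with a small evaluator for the eager binary modalities. The sketch assumes finite words only, represents the right end of the full interval $(0; \infty)$ by len(w) + 1, and encodes $\varphi F_a \psi$ and $\varphi L_a \psi$ as tuples ("F", a, phi, psi) and ("L", a, phi, psi). The sample word bac is our own choice of a situation in which the $c$ reached by $Y_a Y_b X_c$ lies to the right of the last $a$; it is not taken from Figure 2.

```python
TOP = ("T",)

def nxt(w, a, x):
    """X_a(w, x): the next a-position after x, or None."""
    return next((z for z in range(x + 1, len(w) + 1) if w[z - 1] == a), None)

def prv(w, a, y):
    """Y_a(w, y): the previous a-position before y, or None."""
    return next((z for z in range(y - 1, 0, -1) if w[z - 1] == a), None)

def holds(phi, w, x, y):
    """Does the finite word w with the interval (x; y) satisfy phi?"""
    op = phi[0]
    if op == "T":
        return True
    if op == "not":
        return not holds(phi[1], w, x, y)
    a, left, right = phi[1], phi[2], phi[3]
    z = nxt(w, a, x) if op == "F" else prv(w, a, y)
    # The chosen a-position has to lie strictly inside the interval.
    return (z is not None and x < z < y
            and holds(left, w, x, z) and holds(right, w, z, y))

def models(phi, w):
    return holds(phi, w, 0, len(w) + 1)

# (T L_b (T F_c T)) L_a T
crossing = ("L", "a", ("L", "b", TOP, ("F", "c", TOP, TOP)), TOP)
```

On bac the ranker $Y_a Y_b X_c$ visits the positions 2, 1, 3 and is therefore defined, while models(crossing, "bac") is False because the first $c$ after the $b$ lies outside the interval ending at the last $a$. On bca, where no crossing occurs, the formula holds.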

In unambiguous ITL, the modalities $F_a, G_a, F^\ell_a, G^\ell_a$ are
future modalities and $L_a, H_a, L^\ell_a, H^\ell_a$ are past modalities.
An unambiguous ITL formula $\varphi$ is a future formula if in
the parse tree of $\varphi$ every past modality occurs on the left
branch of some future modality, i.e., if it is never necessary to interpret a past modality over an
unbounded interval. For $\mathcal{C} \subseteq \{F_a, L_a, G_a, H_a, F^\ell_a, L^\ell_a, G^\ell_a, H^\ell_a\}$ we define the following fragments of
unambiguous ITL:

• $\mathrm{ITL}[\mathcal{C}]$ consists of all formulas using only $\top$, Boolean connectives, and temporal modalities
in $\mathcal{C}$,

• $\mathrm{ITL}^+[\mathcal{C}]$ consists of all formulas using only $\top$, positive Boolean connectives (i.e., no negation),
and temporal modalities in $\mathcal{C}$,

• $\mathrm{ITL}_F[\mathcal{C}]$ consists of all future formulas using only $\top$, Boolean connectives, and temporal
modalities in $\mathcal{C}$,

• $\mathrm{ITL}^+_F[\mathcal{C}]$ consists of all formulas in $\mathrm{ITL}^+[\mathcal{C}] \cap \mathrm{ITL}_F[\mathcal{C}]$.

Figure 3: $(\varphi_1 F_b \psi_1) L_a (\varphi_2 F_c \psi_2)$

Example 8 Consider the unambiguous ITL formulas

\[
\varphi = (\top L^\ell_b \top) F^\ell_b \top \qquad \text{and} \qquad \psi = \top L^\ell_b (\top F^\ell_b \top)
\]

and a word $\alpha$ with $b \in \mathrm{im}(\alpha)$. The formula $\varphi$ was used in the definition of the semantics of $H^\ell_a$
and, as already mentioned, is only true if the interval is $(\infty; \infty)$. In contrast, $\psi$ is also true if the
interval is $(0; \infty)$.

The following two propositions describe a procedure for converting unambiguous ITL formulas 
into unambiguous TL formulas without introducing new negations. A similar relativization tech- 
nique as in our proof has been used by Lodaya, Pandya, and Shah [4] for the conversion of ITL 
over finite words into so-called deterministic partially ordered two-way automata (but without the 
focus on not introducing negations). Proposition 1 is the eager version, whereas Proposition 2 is 
the lazy version. As will follow from Theorems 1 to 5 we actually have equality for all inclusions 
in both propositions. 

Proposition 1 We have the following inclusions: 

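$$\mathrm{ITL}[F_a, L_a] \subseteq \mathrm{TL}[X_a, Y_a],$$

$$\mathrm{ITL}^{+}[F_a, L_a, G_a, H_a] \subseteq \mathrm{TL}^{+}[X_a, Y_a, G_a, H_a],$$

$$\mathrm{ITL}^{+}[F_a, L_a, G_a] \subseteq \mathrm{TL}^{+}[X_a, Y_a, G_a],$$

$$\mathrm{ITL}^{+}_{F}[F_a, L_a, G_a, H_a] \subseteq \mathrm{TL}^{+}_{X}[X_a, Y_a, G_a],$$

$$\mathrm{ITL}_{F}[F_a, L_a] \subseteq \mathrm{TL}_{X}[X_a, Y_a].$$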

Proof: Note that the atomic modalities $G_a$ and $H_a$ are expressible in $\mathrm{ITL}[F_a, L_a]$ as well as
in $\mathrm{TL}[X_a, Y_a]$. For every $\varphi \in \mathrm{ITL}[F_a, L_a, G_a, H_a]$ we construct an equivalent formula
$\varphi_{(\varepsilon;\varepsilon)} \in \mathrm{TL}[X_a, Y_a, G_a, H_a]$ such that $\varphi_{(\varepsilon;\varepsilon)}$ contains a negation if and only if
$\varphi$ contains a negation, and an $H_a$ term appears in $\varphi_{(\varepsilon;\varepsilon)}$ if and only if it appears in $\varphi$.
This will prove the first three inclusions. For rankers $q, r$ and $\varphi \in \mathrm{ITL}[F_a, L_a, G_a, H_a]$ we
define $\varphi_{(q;r)} \in \mathrm{TL}[X_a, Y_a, G_a, H_a]$ such that $\alpha \models \varphi_{(q;r)}$ if and only if $q(\alpha)$ and $r(\alpha)$
are defined, $q(\alpha) < r(\alpha)$, and

$$\alpha, (q(\alpha); r(\alpha)) \models \varphi$$

with $q(\alpha) = 0$ for $q = \varepsilon$ and $r(\alpha) = \infty$ for $r = \varepsilon$. In particular, in the above situation $q$ and $r$
define the boundaries of an interval $(q;r)$ parameterized by words $\alpha$. The construction of $\varphi_{(q;r)}$ is
by structural induction. We will make extensive use of the formulas $\varrho_q$, $\varrho_r$ and $\vartheta_r$ from Lemma 2
with the convention $\varrho_q = \top$ for $q = \varepsilon$. The atomic formula $\top$ and Boolean connectives are as
follows:

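$$\top_{(q;r)} = q\,\top \wedge r\,\top \wedge r\,\varrho_q$$

$$(\neg\varphi)_{(q;r)} = \top_{(q;r)} \wedge \neg\varphi_{(q;r)}$$

$$(\varphi \wedge \psi)_{(q;r)} = \varphi_{(q;r)} \wedge \psi_{(q;r)}$$

$$(\varphi \vee \psi)_{(q;r)} = \varphi_{(q;r)} \vee \psi_{(q;r)}$$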

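For the atomic ITL-formula $G_a$ we set

$$(G_a)_{(q;r)} = \begin{cases} \top_{(q;\varepsilon)} \wedge q\, G_a & \text{for } r = \varepsilon, \\ \top_{(q;r)} \wedge q\,(G_a \vee X_a\, \vartheta_r) & \text{for } r \neq \varepsilon. \end{cases}$$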

Essentially, the term on the right-hand side says that the next $a$-position after the $q$-position is at
least the $r$-position. In case of the atomic ITL-formula $H_a$ we have to distinguish between $r = \varepsilon$
and $r \neq \varepsilon$. If $r \neq \varepsilon$, then the interval defined by $(q;r)$ is finite and therefore in this situation $H_a$
and $G_a$ are equivalent:



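$$(H_a)_{(q;r)} = (G_a)_{(q;r)}.$$

If $r = \varepsilon$, then

$$(H_a)_{(q;\varepsilon)} = \top_{(q;\varepsilon)} \wedge (H_a \vee q\, G_a),$$

i.e., either there are no or infinitely many $a$-positions, or there is no $a$-position after the $q$-position.
For the $F_a$-modality we define

$$(\varphi\, F_a\, \psi)_{(q;r)} = \top_{(q;r)} \wedge \varphi_{(q;\,qX_a)} \wedge \psi_{(qX_a;\,r)},$$

i.e., we verify $\varphi$ on the interval $(q; qX_a)$ and $\psi$ on the interval $(qX_a; r)$. The $L_a$-modality is similar:

$$(\varphi\, L_a\, \psi)_{(q;r)} = \top_{(q;r)} \wedge \varphi_{(q;\,rY_a)} \wedge \psi_{(rY_a;\,r)},$$

saying that $\varphi$ and $\psi$ are defined on the respective subintervals and that there is some $a$-position in
the interval $(q;r)$.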

Now, for every $\varphi \in \mathrm{ITL}[F_a, L_a, G_a, H_a]$ and any rankers $q, r$, the formula
$\varphi_{(q;r)} \in \mathrm{TL}[X_a, Y_a, G_a, H_a]$ is a Boolean combination of formulas of the form $\top_{(p;s)}$,
$(G_a)_{(p;s)}$, and $(H_a)_{(p;s)}$. Moreover, every negation and every $H_a$-modality in $\varphi_{(q;r)}$ is only
caused by the respective operation in $\varphi$. This completes the proof of the first three inclusions.

For the last two inclusions, we first observe that in our construction the following invariants hold:

• If $q$ and $r$ are X-rankers with $r \neq \varepsilon$, then $\varphi_{(q;r)} \in \mathrm{TL}_{X}[X_a, Y_a, G_a]$ for every formula
$\varphi \in \mathrm{ITL}[F_a, L_a, G_a, H_a]$.

• If $q$ is an X-ranker and $r = \varepsilon$, then for every $\varphi \in \mathrm{ITL}[F_a, L_a, G_a, H_a]$ and every
$\psi \in \mathrm{ITL}_{F}[F_a, L_a, G_a, H_a]$ we have $(\varphi\, F_a\, \psi)_{(q;r)} \in \mathrm{TL}_{X}[X_a, Y_a, G_a]$.

Therefore, if $\varphi \in \mathrm{ITL}_{F}[F_a, L_a, G_a, H_a]$, then $\varphi_{(\varepsilon;\varepsilon)} \in \mathrm{TL}_{X}[X_a, Y_a, G_a]$. Hence, the fourth
and the fifth inclusion follow. □

Proposition 2 We have the following inclusions: 

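$$\mathrm{ITL}[F^{\ell}_a, L^{\ell}_a] \subseteq \mathrm{TL}[X^{\ell}_a, Y^{\ell}_a],$$

$$\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a, H^{\ell}_a] \subseteq \mathrm{TL}^{+}[X^{\ell}_a, Y^{\ell}_a, G^{\ell}_a, H^{\ell}_a],$$

$$\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, H^{\ell}_a] \subseteq \mathrm{TL}^{+}[X^{\ell}_a, Y^{\ell}_a, H^{\ell}_a].$$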

Proof: For $\varphi \in \mathrm{ITL}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ we construct an equivalent formula
$\varphi_{(\varepsilon;\varepsilon)} \in \mathrm{TL}[X^{\ell}_a, Y^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ such that $\varphi_{(\varepsilon;\varepsilon)}$ contains a negation if and only if
$\varphi$ does, and a $G^{\ell}_a$ term appears in $\varphi_{(\varepsilon;\varepsilon)}$ if and only if it appears in $\varphi$. We use the following
construction. For lazy rankers $q, r$ and $\varphi \in \mathrm{ITL}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ we define
$\varphi_{(q;r)} \in \mathrm{TL}[X^{\ell}_a, Y^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ such that $\alpha \models \varphi_{(q;r)}$ if and only if $q(\alpha)$ and $r(\alpha)$ are defined,
$q(\alpha) < r(\alpha)$, and

$$\alpha, (q(\alpha); r(\alpha)) \models \varphi$$

with $q(\alpha) = 0$ for $q = \varepsilon$ and $r(\alpha) = \infty$ for $r = \varepsilon$. In particular, in the above situation $q$ and $r$
define the boundaries of an interval $(q;r)$ parameterized by words $\alpha$. The construction of $\varphi_{(q;r)}$ is
by structural induction. We will make extensive use of the formulas $\varrho_r$ and $\vartheta_r$ from Lemma 3. The
atomic formula $\top$ and Boolean connectives are as follows:



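$$\top_{(q;r)} = q\,\top \wedge r\,\top \wedge q\,\varrho_r$$

$$(\neg\varphi)_{(q;r)} = \top_{(q;r)} \wedge \neg\varphi_{(q;r)}$$

$$(\varphi \wedge \psi)_{(q;r)} = \varphi_{(q;r)} \wedge \psi_{(q;r)}$$

$$(\varphi \vee \psi)_{(q;r)} = \varphi_{(q;r)} \vee \psi_{(q;r)}$$

with $\varrho_r = \top$ for $r = \varepsilon$. For the atomic ITL-formula $H^{\ell}_a$ we set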



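$$(H^{\ell}_a)_{(q;r)} = \begin{cases} \top_{(q;r)} \wedge r\,(H^{\ell}_a \vee Y^{\ell}_a\, \vartheta_q) & \text{if } q \neq \varepsilon, \\ \top_{(\varepsilon;r)} \wedge r\, H^{\ell}_a & \text{if } q = \varepsilon. \end{cases}$$

This is consistent with the definition that for all $a \in \Gamma$ and all infinite words $\alpha \in \Gamma^{\omega}$ we have
$\alpha, (\infty;\infty) \models H^{\ell}_a$ in unambiguous ITL. The atomic modality $G^{\ell}_a$ is slightly more technical due to its
behavior on the interval $(\infty;\infty)$. For $r = \varepsilon$ we set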

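$$(G^{\ell}_a)_{(q;\varepsilon)} = \top_{(q;\varepsilon)} \wedge q\, G^{\ell}_a.$$

If $r \neq \varepsilon$ is an $X^{\ell}$-ranker we define

$$(G^{\ell}_a)_{(q;r)} = \top_{(q;r)} \wedge r\,(H^{\ell}_a \vee Y^{\ell}_a\, \vartheta_q)$$

with $\vartheta_q = \bot$ for $q = \varepsilon$. Therefore, if $q = \varepsilon$, then we can omit the term $Y^{\ell}_a\, \vartheta_q$ in the above formula.
Finally, for a non-empty $Y^{\ell}$-ranker $r$ we use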

(GDfer) = T(,;,)A('(r(H^VY^^,)A VYfe^q) VgGi 

where $B = \mathrm{alph}_{\Gamma}(r)$ is the set of letters which occur in some modality of the ranker $r$. As before,
we set $\vartheta_q = \bot$ for $q = \varepsilon$. The above formula distinguishes two cases. The first case is that $r$
determines a finite position (after some last occurrence of a letter $b \in B$ from the ranker $r$ there is
no $b$-position). In this case either there is no $a$-position before the $r$-position, or the first $a$-position
before the $r$-position is on the left-hand side of the $q$-position. The other case is that $r$ leads to the
infinite position and then we want to see no $a$-position on the right-hand side of the $q$-position.
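For the $F^{\ell}_a$-modality we define

$$(\varphi\, F^{\ell}_a\, \psi)_{(q;r)} = \top_{(q;r)} \wedge \varphi_{(q;\,qX^{\ell}_a)} \wedge \psi_{(qX^{\ell}_a;\,r)}$$

and for $L^{\ell}_a$ we set, symmetrically,

$$(\varphi\, L^{\ell}_a\, \psi)_{(q;r)} = \top_{(q;r)} \wedge \varphi_{(q;\,rY^{\ell}_a)} \wedge \psi_{(rY^{\ell}_a;\,r)}.$$

Now, for every $\varphi \in \mathrm{ITL}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ and for all lazy rankers $q, r$, the formula
$\varphi_{(q;r)} \in \mathrm{TL}[X^{\ell}_a, Y^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ is a Boolean combination of formulas of the form $\top_{(p;s)}$,
$(G^{\ell}_a)_{(p;s)}$, and $(H^{\ell}_a)_{(p;s)}$. Moreover, every negation and every $G^{\ell}_a$-modality in $\varphi_{(q;r)}$ is due to
the respective operation in $\varphi$. This completes the proof. □

4 The Fragment $\mathrm{FO}^2$

This section contains various ITL, TL, and ranker characterizations using the eager variants. We
postpone characterizations in terms of the lazy fragments to Theorem 5.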

Theorem 1 For $L \subseteq \Gamma^{\infty}$ the following assertions are equivalent:

1. $L$ is definable in $\mathrm{FO}^2$.

2. $L$ is definable in $\mathrm{ITL}^{+}[F_a, L_a, G_a, H_a]$.

3. $L$ is definable in $\mathrm{ITL}[F_a, L_a]$.

4. $L$ is definable in $\mathrm{TL}[X_a, Y_a]$.

5. $L$ is definable in $\mathrm{TL}^{+}[X_a, Y_a, G_a, H_a]$.

6. $L$ is a positive ranker language with atomic modalities $G_a$ and $H_a$.

7. $L$ is a ranker language.

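Lemma 4 Let $A \subseteq \Gamma$. Then $A^{\infty}$ is definable in $\mathrm{ITL}^{+}[G_a]$, and $A^{\mathrm{im}}$ is definable in $\mathrm{ITL}^{+}[F_a, H_a]$.

Proof: A letter $a \in \Gamma$ does not appear in a word $\alpha$ if and only if $\alpha \models G_a$, and $a$ appears infinitely
often in a word $\alpha$ if and only if $\alpha \models (\top\, F_a\, \top) \wedge H_a$. Hence

$$A^{\infty} \text{ is defined by } \bigwedge_{a \notin A} G_a \qquad\text{and}\qquad A^{\mathrm{im}} \text{ is defined by } \bigwedge_{a \in A} (\top\, F_a\, \top) \wedge H_a. \qquad \Box$$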
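The two letter-level facts used in this proof are easy to test on ultimately periodic words. The sketch below is only an illustration under an assumed representation: an infinite word $\alpha = u v^{\omega}$ is given by a finite prefix `u` and a non-empty period `v`, so that $\mathrm{alph}(\alpha)$ is the set of letters of `u` and `v`, and $\mathrm{im}(\alpha)$ is the set of letters of `v`. The function names are hypothetical.

```python
from typing import Set


def alph(u: str, v: str) -> Set[str]:
    """Letters occurring in the word u v^omega."""
    return set(u) | set(v)


def im(u: str, v: str) -> Set[str]:
    """Letters occurring infinitely often in u v^omega (v must be non-empty)."""
    if not v:
        raise ValueError("the period v must be non-empty")
    return set(v)


def satisfies_G(u: str, v: str, a: str) -> bool:
    """alpha |= G_a, i.e. the letter a does not appear in alpha."""
    return a not in alph(u, v)


def occurs_infinitely_often(u: str, v: str, a: str) -> bool:
    """alpha |= (T F_a T) and H_a, i.e. a appears infinitely often in alpha."""
    return a in im(u, v)


def in_A_infinity(u: str, v: str, A: Set[str], gamma: Set[str]) -> bool:
    """Membership in A^infinity via the conjunction of G_a over all a not in A."""
    return all(satisfies_G(u, v, a) for a in gamma - A)
```

For example, with `u = "ab"` and `v = "b"` the word is $a b^{\omega}$: the letter `a` appears but not infinitely often, and the word lies in $\{a, b\}^{\infty}$ but not in $\{b\}^{\infty}$.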

Lemma 5 Every unambiguous monomial $L = A_1^{*} a_1 \cdots A_k^{*} a_k A_{k+1}^{\infty}$ is definable in $\mathrm{ITL}^{+}[F_a, L_a, G_a]$.

Proof: We perform an induction on $k$. For $k = 0$ we have $L = A_1^{\infty}$ which is definable in $\mathrm{ITL}^{+}[G_a]$
by Lemma 4. Let $k \geq 1$. Since $L$ is unambiguous, we have $\{a_1, \ldots, a_k\} \not\subseteq A_1 \cap A_{k+1}$; otherwise
$(a_1 \cdots a_k)^2$ admits two different factorizations showing that $L$ is not unambiguous. First, consider
the case $a_i \notin A_1$ and let $i$ be minimal with this property. Each word $\alpha \in L$ has a unique factorization
$\alpha = u a_i \beta$ such that $a_i \notin \mathrm{alph}(u)$. Depending on whether the first $a_i$ of $\alpha$ coincides with the marker
$a_i$ or not, we have

$$u \in A_1^{*} a_1 \cdots A_i^{*}, \quad \beta \in A_{i+1}^{*} a_{i+1} \cdots A_k^{*} a_k A_{k+1}^{\infty} \qquad\text{or}$$

$$u \in A_1^{*} a_1 \cdots A_j^{*}, \quad a_i \in A_j, \quad \beta \in A_j^{*} a_j \cdots A_k^{*} a_k A_{k+1}^{\infty}$$

with $2 \leq j \leq i$. In both cases, since $L$ is unambiguous, each expression containing $u$ or $\beta$ is
unambiguous. Moreover, each of these expressions is strictly shorter than $L$. By induction, for
each $2 \leq j \leq k$, there exist formulas $\varphi, \psi \in \mathrm{ITL}^{+}[F_a, L_a, G_a]$ such that $L(\varphi) = A_1^{*} a_1 \cdots A_j^{*}$ and
$L(\psi) = A_j^{*} a_j \cdots A_k^{*} a_k A_{k+1}^{\infty}$. By the above reasoning, we see that $L$ is the union of (at most $i$)
languages of the form

$$\bigl(L(\varphi) \cap (\Gamma \setminus \{a_i\})^{*}\bigr)\, a_i\, L(\psi)$$

and each of them is defined by $\varphi\, F_{a_i}\, \psi$.

For $a_i \notin A_{k+1}$ with $i$ maximal we consider the unique factorization $\alpha = u a_i \beta$ with $a_i \notin \mathrm{alph}(\beta)$
and again we end up with one of the two cases from above, with the difference that $1 \leq i \leq j \leq k$
in the second case. Inductively $L$ is defined by a disjunction of formulas $\varphi\, L_{a_i}\, \psi$. □
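The unambiguity argument at the start of this proof can be explored by counting factorizations. The following sketch is restricted to finite words (so the final block $A_{k+1}^{\infty}$ is read as $A_{k+1}^{*}$, an assumption made only for this illustration); the function name and the list-based encoding of a monomial are hypothetical.

```python
from functools import lru_cache
from typing import Sequence, Set


def count_factorizations(word: str, sets: Sequence[Set[str]], markers: Sequence[str]) -> int:
    """Count the factorizations of a finite word in A_1* a_1 ... A_k* a_k A_{k+1}*.

    sets    = [A_1, ..., A_{k+1}]
    markers = [a_1, ..., a_k]
    """
    if len(sets) != len(markers) + 1:
        raise ValueError("need exactly one more alphabet than markers")
    k = len(markers)

    @lru_cache(maxsize=None)
    def count(i: int, start: int) -> int:
        # Number of ways to match word[start:] against A_{i+1}* a_{i+1} ... A_{k+1}* (0-based i).
        if i == k:
            return 1 if all(c in sets[k] for c in word[start:]) else 0
        total = 0
        for p in range(start, len(word)):
            if word[p] == markers[i]:
                # Use position p as the marker; word[start:p] stayed inside sets[i].
                total += count(i + 1, p + 1)
            if word[p] not in sets[i]:
                # The block A_{i+1}* cannot be extended across position p.
                break
        return total

    return count(0, 0)
```

With $A_1 = A_2 = \{a, b\}$ and the single marker $a$, the word $aa = (a_1)^2$ has two factorizations, so the monomial is not unambiguous; replacing $A_1$ by $\{b\}$ leaves exactly one.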
The language $L(H_a)$ is definable in $\Pi_2 \cap \mathrm{FO}^2$ using the following formula

$$\forall x\, \exists y\colon\ \lambda(x) \neq a \;\vee\; (y > x \wedge \lambda(y) = a),$$

but it is not definable in $\Sigma_2$. Therefore, we have to exclude the case $r = H_a$ in the following lemma.
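As a sanity check on finite words, the formula can be read as "every $a$-position is followed by a later $a$-position". Over finite words this can only hold if there is no $a$ at all, which matches $H_a$ ("no or infinitely many $a$-positions") restricted to finite words. The brute-force sketch below assumes exactly this reading; the function name is hypothetical.

```python
from itertools import product


def every_a_has_later_a(word: str, a: str = "a") -> bool:
    """forall x exists y: lambda(x) != a  or  (y > x and lambda(y) = a), on a finite word."""
    n = len(word)
    return all(
        any(word[x] != a or (y > x and word[y] == a) for y in range(n))
        for x in range(n)
    )


def check_all_words(max_len: int = 5) -> bool:
    """Compare the formula with 'a does not occur' on all words over {a, b} up to max_len."""
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            if every_a_has_later_a(word) != ("a" not in word):
                return False
    return True
```

Calling `check_all_words()` exhaustively confirms the equivalence for all words over $\{a, b\}$ of length at most five.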

Lemma 6 Let $r \neq H_a$ be a non-empty ranker, potentially with atomic modality $G_a$ or $H_a$. Then
$L(r)$ is definable in $\mathrm{FO}^2 \cap \Sigma_2$.

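Proof: By induction on $k = |r|$ we construct formulas $\mu_r(x) \in \mathrm{FO}^2$ and $\sigma_r(x) \in \Sigma_2$ with one free
variable such that for every word $\alpha \in \Gamma^{\infty}$ we have

$$\alpha \models \exists x\colon \mu_r(x) \quad\text{iff}\quad r(\alpha) \text{ is defined} \quad\text{iff}\quad \alpha \models \exists x\colon \sigma_r(x)$$

and if $r(\alpha)$ is defined, then

$$\alpha, x \models \mu_r(x) \quad\text{iff}\quad x = r(\alpha) \quad\text{iff}\quad \alpha, x \models \sigma_r(x).$$

For $\mu_r$ we only use the variables $x$ and $y$. By interchanging the names, we can always choose
whether $x$ or $y$ is the free variable of $\mu_r$. The formula $\sigma_r = \sigma_r(x_k)$ will have the form

$$\exists x_{k-1} \cdots \exists x_1\, \forall y\colon\ \nu_r(x_k, \ldots, x_1, y).$$

For $r = G_a$ we set

$$\mu_r(x) = \sigma_r(x) = \forall y\colon \lambda(y) \neq a.$$

For $r = Y_a$ we define

$$\mu_r(x) = \sigma_r(x) = \forall y\colon\ \lambda(x) = a \wedge (y \leq x \vee \lambda(y) \neq a).$$

Note that if $\alpha$ contains infinitely many $a$'s, then $\alpha \not\models \exists x\colon \mu_r(x)$ for $r = Y_a$. The case $r = X_a$ is
symmetric; we only have to replace "$\leq$" with "$\geq$" in the above formula.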
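The base case $r = Y_a$ can be tested by brute force on finite words: the formula should hold at a position $x$ exactly when $x$ is the last $a$-position. The sketch below assumes the non-strict comparison $y \leq x$ in the formula (with a strict comparison the case $y = x$ would make it unsatisfiable) and uses 0-based positions; the function names are hypothetical.

```python
from itertools import product
from typing import List


def mu_Ya(word: str, x: int, a: str) -> bool:
    """forall y: lambda(x) = a  and  (y <= x or lambda(y) != a), at the 0-based position x."""
    return all(word[x] == a and (y <= x or word[y] != a) for y in range(len(word)))


def positions_satisfying_mu(word: str, a: str) -> List[int]:
    """All positions x of the word at which mu_{Y_a}(x) holds."""
    return [x for x in range(len(word)) if mu_Ya(word, x, a)]


def check_all_words(max_len: int = 5) -> bool:
    """mu_{Y_a} should single out the last a-position, or nothing if a does not occur."""
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            expected = [word.rindex("a")] if "a" in word else []
            if positions_satisfying_mu(word, "a") != expected:
                return False
    return True
```

For instance, `positions_satisfying_mu("abab", "a")` yields the single position of the last `a`, and the list is empty for a word without `a`, which mirrors "$r(\alpha)$ is defined iff $\alpha \models \exists x\colon \mu_r(x)$" on finite words.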
Let now $k = |r| > 1$. We first consider the formula $\mu_r$. If $r = sX_a$, then

$$\mu_r(x) = \lambda(x) = a \;\wedge\; \exists y < x\colon \mu_s(y) \;\wedge\; \forall y < x\colon \bigl(\lambda(y) \neq a \vee \exists x \geq y\colon \mu_s(x)\bigr)$$

saying that $x$ is an $a$-position greater than the $s$-position (in particular, $r$ is defined) and that all
$a$-positions smaller than $x$ are not greater than the $s$-position. The case $r = sY_a$ is symmetric; we
only have to replace all "$<$" by "$>$" and "$\geq$" by "$\leq$" in the above formula. If $r = sG_a$, then

$$\mu_r(x) = \mu_s(x) \wedge \forall y > x\colon \lambda(y) \neq a,$$

and for $r = sH_a$ we use the same formula, but $y > x$ is replaced by $y < x$.

We now describe the construction of the formula $\sigma_r$. Suppose $r = sX_a$ and let
$\sigma_s(x_{k-1}) = \exists x_{k-2} \cdots \exists x_1\, \forall y\colon \nu_s(x_{k-1}, \ldots, x_1, y)$. Then

$$\sigma_r(x_k) = \exists x_{k-1} \cdots \exists x_1\, \forall y\colon\ \begin{cases} \lambda(x_k) = a \;\wedge \\ x_k > x_{k-1} \wedge \nu_s(x_{k-1}, \ldots, x_1, y) \;\wedge \\ (y \leq x_{k-1} \vee y \geq x_k \vee \lambda(y) \neq a). \end{cases}$$

The semantics of $\sigma_r$ is as follows: $x_k$ is labeled by $a$ and it is strictly greater than the $s$-position
$x_{k-1}$; moreover, there is no $a$-position strictly between $x_{k-1}$ and $x_k$. Hence, $x_k$ determines the
$r$-position. As before, the case $r = sY_a$ is symmetric; we only have to replace "$>$" by "$<$" and we
have to interchange "$\geq$" and "$\leq$" in the above formula. If $r = sG_a$, then

$$\sigma_r(x_k) = \exists x_{k-1} \cdots \exists x_1\, \forall y\colon\ \nu_s(x_k, x_{k-2}, \ldots, x_1, y) \wedge (y \leq x_k \vee \lambda(y) \neq a),$$

and for $r = sH_a$ we use the same formula, but $y \leq x_k$ is replaced by $y \geq x_k$. □






Proof (Theorem 1): We show "1 ⇒ 2 ⇒ 3 ⇒ 4 ⇒ 7 ⇒ 1" and "2 ⇒ 5 ⇒ 6 ⇒ 4".

"1 ⇒ 2": Every $\mathrm{FO}^2$-definable language is a finite union of languages of the form $P \cap A^{\mathrm{im}}$ with an
unambiguous monomial $P$ and $A \subseteq \Gamma$, see [2]. Since $\mathrm{ITL}^{+}[F_a, L_a, G_a, H_a]$ is closed under finite unions
and finite intersections, it suffices to show that $A^{\mathrm{im}}$ and $P$ are definable in $\mathrm{ITL}^{+}[F_a, L_a, G_a, H_a]$.
This follows from Lemma 4 and Lemma 5, respectively.

"2 ⇒ 3" and "6 ⇒ 4" are trivial. "3 ⇒ 4" and "2 ⇒ 5": Proposition 1. "4 ⇒ 7" and "5 ⇒ 6":
Lemma 1.

"7 ⇒ 1": By Lemma 6, for every ranker $r$ the language $L(r)$ is $\mathrm{FO}^2$-definable. Since $\mathrm{FO}^2$ is
closed under Boolean operations, every ranker language is $\mathrm{FO}^2$-definable. □



5 The Fragment $\Sigma_2 \cap \mathrm{FO}^2$

In the following, we show that $\Sigma_2 \cap \mathrm{FO}^2$ admits characterizations in terms of eager ITL, TL, and
rankers.

Theorem 2 Let $L \subseteq \Gamma^{\infty}$. The following assertions are equivalent:

1. $L$ is definable in $\Sigma_2$ and $\mathrm{FO}^2$.

2. $L$ is definable in $\mathrm{ITL}^{+}[F_a, L_a, G_a]$.

3. $L$ is definable in $\mathrm{TL}^{+}[X_a, Y_a, G_a]$.

4. $L$ is a positive ranker language with atomic modality $G_a$.

5. $L$ is a ranker language with atomic modality $G_a$ with the restriction that all Y-rankers are
positive.

Note that we cannot use lazy counterparts in the above characterizations, since for example $Y^{\ell}_a X^{\ell}_a$
is defined if and only if there are infinitely many $a$'s, but this property is not $\Sigma_2$-definable.
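The difference between eager and lazy rankers noted here can be illustrated on ultimately periodic words $u v^{\omega}$. The sketch below is an assumption-laden reading, not the paper's definition: positions are 1-based, X-rankers start at position 0 and Y-rankers at $\infty$; an eager $Y_a$ from $\infty$ is defined only if $a$ occurs finitely often (and at least once), whereas a lazy modality applied at $\infty$ stays at $\infty$ whenever $a \in \mathrm{im}(\alpha)$. All names are hypothetical.

```python
import math
from typing import List, Optional, Tuple, Union

INF = math.inf
Position = Union[int, float]
Ranker = List[Tuple[str, str]]  # e.g. [("Y", "a"), ("X", "a")]


def letter(u: str, v: str, pos: int) -> str:
    """Letter at the 1-based position pos of u v^omega."""
    return u[pos - 1] if pos <= len(u) else v[(pos - len(u) - 1) % len(v)]


def next_pos(u: str, v: str, a: str, x: int) -> Optional[int]:
    """Smallest a-position greater than the finite position x, if any."""
    start = x + 1
    # Scanning the rest of u plus one full period of v is enough.
    for y in range(start, max(start, len(u) + 1) + len(v)):
        if letter(u, v, y) == a:
            return y
    return None


def prev_pos(u: str, v: str, a: str, x: int) -> Optional[int]:
    """Largest a-position smaller than the finite position x, if any."""
    for y in range(x - 1, 0, -1):
        if letter(u, v, y) == a:
            return y
    return None


def last_pos(u: str, v: str, a: str) -> Optional[int]:
    """Last a-position if a occurs finitely often and at least once."""
    if a in v:
        return None
    return prev_pos(u, v, a, len(u) + 1)


def run(u: str, v: str, ranker: Ranker, lazy: bool) -> Optional[Position]:
    """Position reached by the ranker on u v^omega, or None if it is undefined."""
    pos: Position = 0 if ranker[0][0] == "X" else INF
    for kind, a in ranker:
        if pos == INF:
            if lazy and a in v:
                nxt: Optional[Position] = INF  # lazy: a occurs infinitely often, stay at infinity
            elif kind == "Y":
                nxt = last_pos(u, v, a)
            else:
                nxt = None
        elif kind == "X":
            nxt = next_pos(u, v, a, int(pos))
        else:
            nxt = prev_pos(u, v, a, int(pos))
        if nxt is None:
            return None
        pos = nxt
    return pos
```

Under these assumptions, the lazy ranker $Y^{\ell}_a X^{\ell}_a$ is defined on $b(ab)^{\omega}$ but undefined on $a b^{\omega}$, while the eager $Y_a$ behaves the other way round.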

Lemma 7 Let $r$ be an X-ranker with atomic modality $G_a$. Then $\Gamma^{\infty} \setminus L(r)$ is $\Sigma_2$-definable.

Proof: If $r$ is an X-ranker which is not defined on $\alpha$, then there is a longest prefix $p$ of $r$ such that
$p$ is defined on $\alpha$. Write $r = pq$. If the first modality of $q$ is of the form $X_a$ or $Y_a$ or $G_a$, then we set
$s = pG_a$ or $s = pH_a$ or $s = pX_a$, respectively. Note that if $q$ starts with $Y_a$, then $p$ is a non-empty
X-ranker. In any case $s \neq H_a$, and therefore, $L(s)$ is $\Sigma_2$-definable by Lemma 6. Hence, we find a
finite set of $\Sigma_2$-definable languages whose union is $\Gamma^{\infty} \setminus L(r)$. But $\Sigma_2$ is closed under union and
thus $\Gamma^{\infty} \setminus L(r)$ is $\Sigma_2$-definable. □

Proof (Theorem 2): "1 ⇒ 2": A language $L$ is definable in $\Sigma_2 \cap \mathrm{FO}^2$ if and only if $L$ is a union of
unambiguous monomials, see [2]. Since $\mathrm{ITL}^{+}[F_a, L_a, G_a]$ is closed under union, it suffices to show
that every unambiguous monomial is definable in $\mathrm{ITL}^{+}[F_a, L_a, G_a]$; this is exactly Lemma 5.

"2 ⇒ 3": Proposition 1. "3 ⇒ 4": Lemma 1. "4 ⇒ 5": trivial.

"5 ⇒ 1": Since the class of languages definable in $\Sigma_2 \cap \mathrm{FO}^2$ is closed under finite union and
finite intersection, the claim follows from Lemma 6 and Lemma 7. □

Over finite words, the fragments $\mathrm{FO}^2$ and $\Delta_2$ coincide [8]. In particular, $\mathrm{FO}^2 \cap \Sigma_2 = \mathrm{FO}^2$ for
finite words. Since finiteness of a word is definable in $\mathrm{FO}^2 \cap \Sigma_2$, we obtain the following corollary
of Theorem 2.

Corollary 1 A language $L \subseteq \Gamma^{*}$ of finite words is definable in $\mathrm{FO}^2$ if and only if $L$ is a positive
ranker language with atomic modality $G_a$.






6 The Fragment $\Delta_2$

Over infinite words, the fragment $\Delta_2$ is a strict subclass of $\mathrm{FO}^2$. In this section, we show that
$\Delta_2$ is essentially $\mathrm{FO}^2$ without past formulas and past-rankers. Since eager future formulas
and future-rankers coincide with their lazy counterparts, all of the characterizations in the next
theorem could be replaced by their lazy counterparts.

Theorem 3 Let $L \subseteq \Gamma^{\infty}$. The following assertions are equivalent:

1. $L$ is definable in $\Delta_2$.

2. $L$ is definable in $\mathrm{ITL}^{+}_{F}[F_a, L_a, G_a]$.

3. $L$ is definable in $\mathrm{ITL}_{F}[F_a, L_a]$.

4. $L$ is definable in $\mathrm{TL}_{X}[X_a, Y_a]$.

5. $L$ is definable in $\mathrm{TL}^{+}_{X}[X_a, Y_a, G_a]$.

6. $L$ is a positive X-ranker language with atomic modality $G_a$.

7. $L$ is an X-ranker language.

Lemma 8 Every language definable in $\Delta_2$ is definable in $\mathrm{ITL}^{+}_{F}[F_a, L_a, G_a]$.

Proof: It is known that a $\Delta_2$-definable language is a finite union of unambiguous monomials
$L = A_1^{*} a_1 \cdots A_k^{*} a_k A_{k+1}^{\infty}$ such that $\{a_j, \ldots, a_k\} \not\subseteq A_j$ for all $1 \leq j \leq k$, see [2]. Let $i$ be minimal
such that $a_i \notin A_1$ and for each word $\alpha \in L$ consider the factorization $\alpha = u a_i \beta$ such that $a_i \notin \mathrm{alph}(u)$.
There are two cases:

$$u \in A_1^{*} a_1 \cdots A_i^{*}, \quad \beta \in A_{i+1}^{*} a_{i+1} \cdots A_k^{*} a_k A_{k+1}^{\infty} \qquad\text{or}$$

$$u \in A_1^{*} a_1 \cdots A_j^{*}, \quad a_i \in A_j, \quad \beta \in A_j^{*} a_j \cdots A_k^{*} a_k A_{k+1}^{\infty}$$

with $2 \leq j \leq i$. In each case the expression $P = A_j^{*} a_j \cdots A_k^{*} a_k A_{k+1}^{\infty}$ containing $\beta$ is unambiguous
since $L$ is. Moreover, the expression is shorter than that for $L$ and we have $\{a_i, \ldots, a_k\} \not\subseteq A_i$ for
all $j \leq i \leq k$. By induction $P$ is definable by an $\mathrm{ITL}^{+}_{F}[F_a, L_a, G_a]$-formula $\psi$. By Lemma 5, we
get an $\mathrm{ITL}^{+}[F_a, L_a, G_a]$-formula $\varphi$ defining the monomial $A_1^{*} a_1 \cdots A_j^{*}$. Therefore $L$ is the union of
languages defined by formulas of the form $\varphi\, F_{a_i}\, \psi$, each of which is an $\mathrm{ITL}^{+}_{F}[F_a, L_a, G_a]$-formula by
definition, given the constraints imposed on $\varphi$ and $\psi$. □

Proof (Theorem 3): We show "1 ⇒ 2 ⇒ 3 ⇒ 4 ⇒ 7 ⇒ 1" and "2 ⇒ 5 ⇒ 6 ⇒ 4".

"1 ⇒ 2" is Lemma 8 and "2 ⇒ 3" as well as "6 ⇒ 4" are trivial. "3 ⇒ 4" and "2 ⇒ 5":
Proposition 1. "4 ⇒ 7" and "5 ⇒ 6": Lemma 1.

It remains to show "7 ⇒ 1". By Lemma 6 and Lemma 7, for every X-ranker $r$ the languages
$L(r)$ and $\Gamma^{\infty} \setminus L(r)$ are definable in $\Delta_2 = \Sigma_2 \cap \Pi_2$. Since $\Delta_2$ is closed under finite unions and
finite intersections, the claim follows. □

7 The Fragment $\Pi_2 \cap \mathrm{FO}^2$

In this section we give characterizations of the fragment $\Pi_2 \cap \mathrm{FO}^2$ in terms of the lazy variants of
ITL, TL, and rankers. We cannot use the eager variants, since $Y_a$ says that there are only finitely
many $a$'s, but this property is not $\Pi_2$-definable. Also note that $\alpha, (\infty;\infty) \models \neg(\top\, L^{\ell}_a\, \top)$
if and only if $a \notin \mathrm{im}(\alpha)$, i.e., if and only if $a$ occurs at most finitely often. As before, this property
is not $\Pi_2$-definable. This is the reason why we did not define $H^{\ell}_a$ simply as $\neg(\top\, L^{\ell}_a\, \top)$.

Theorem 4 Let $L \subseteq \Gamma^{\infty}$. The following assertions are equivalent:

1. $L$ is definable in $\Pi_2$ and $\mathrm{FO}^2$.

2. $L$ is definable in $\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, H^{\ell}_a]$.

3. $L$ is definable in $\mathrm{TL}^{+}[X^{\ell}_a, Y^{\ell}_a, H^{\ell}_a]$.

4. $L$ is a positive lazy ranker language with atomic modality $H^{\ell}_a$.

5. $L$ is a lazy ranker language with atomic modality $H^{\ell}_a$ with the restriction that all $Y^{\ell}$-rankers
are positive.

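Lemma 9 The complement $\Gamma^{\infty} \setminus L$ of every unambiguous monomial $L = A_1^{*} a_1 \cdots A_k^{*} a_k A_{k+1}^{\infty}$ is
definable in $\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, H^{\ell}_a]$.

Proof: We perform an induction on $k$. For $k = 0$ we have $L = A_1^{\infty}$ and $\Gamma^{\infty} \setminus A_1^{\infty}$ is defined by the
$\mathrm{ITL}^{+}[F^{\ell}_a]$-formula

$$\bigvee_{a \notin A_1} (\top\, F^{\ell}_a\, \top).$$

Let now $k > 0$. Since $L$ is unambiguous, we have $\{a_1, \ldots, a_k\} \not\subseteq A_1 \cap A_{k+1}$; otherwise $(a_1 \cdots a_k)^2$
admits two different factorizations showing that $L$ is not unambiguous. First, consider the case
$a_i \notin A_{k+1}$ and let $i$ be maximal with this property. For $\alpha \in \Gamma^{\infty}$ we have $\alpha \notin L$ if and only if one
of the following conditions is true: The first condition is $a_i \notin \mathrm{alph}(\alpha)$ or $a_i \in \mathrm{im}(\alpha)$, and the second
condition is $a_i \in \mathrm{alph}(\alpha) \setminus \mathrm{im}(\alpha)$ and the following holds for $\alpha = u a_i \beta$ with $a_i \notin \mathrm{alph}(\beta)$:

• $u \notin A_1^{*} a_1 \cdots A_i^{*}$ or $\beta \notin A_{i+1}^{*} a_{i+1} \cdots A_{k+1}^{\infty}$, and

• for all $i < j \leq k$ with $a_i \in A_j$ we have: $u \notin A_1^{*} a_1 \cdots A_j^{*}$ or $\beta \notin A_j^{*} a_j \cdots A_{k+1}^{\infty}$.

The monomials $A_1^{*} a_1 \cdots A_j^{\infty}$ for $i \leq j \leq k$ and $A_j^{*} a_j \cdots A_{k+1}^{\infty}$ for $i < j \leq k$ are unambiguous and
have degree smaller than $k$. Hence by induction, there exist formulas $\varphi_j, \psi_j \in \mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, H^{\ell}_a]$ such
that $L(\varphi_j) = \Gamma^{\infty} \setminus A_1^{*} a_1 \cdots A_j^{\infty}$ and $L(\psi_j) = \Gamma^{\infty} \setminus A_j^{*} a_j \cdots A_{k+1}^{\infty}$. This yields the following formula
for the complement of $L$: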

H^- V (T li^ (T F^^ T)) V 

The first line captures the first condition from above, since $\top\, L^{\ell}_{a_i}\, (\top\, F^{\ell}_{a_i}\, \top)$ is true if and only if $a_i$
appears infinitely often. Note that a term $T$ saying that $a_i$ occurs only finitely often and at least
once is not required in the above formula, even though it would be natural to include it on the
right-hand side of the second disjunction at the outermost level (if $T$ is false, then one of the first
two terms is true). Hence, we do not have to care about the case in which the right interval of
some $L^{\ell}_{a_i}$-modality is $(\infty;\infty)$.

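Let now $a_i \notin A_1$ and let $i$ be minimal with this property. For $\alpha \in \Gamma^{\infty}$ we have $\alpha \notin L$ if and
only if one of the following conditions is true: The first condition is $a_i \notin \mathrm{alph}(\alpha)$, and the second
condition is $a_i \in \mathrm{alph}(\alpha)$ and the following holds for $\alpha = u a_i \beta$ with $a_i \notin \mathrm{alph}(u)$:

• $u \notin A_1^{*} a_1 \cdots A_i^{*}$ or $\beta \notin A_{i+1}^{*} a_{i+1} \cdots A_{k+1}^{\infty}$, and

• for all $1 < j \leq i$ with $a_i \in A_j$ we have: $u \notin A_1^{*} a_1 \cdots A_j^{*}$ or $\beta \notin A_j^{*} a_j \cdots A_{k+1}^{\infty}$.

The monomials $A_1^{*} a_1 \cdots A_j^{\infty}$ for $1 < j \leq i$ and $A_j^{*} a_j \cdots A_{k+1}^{\infty}$ for $1 < j \leq i + 1$ are unambiguous
and have degree smaller than $k$. Hence by induction, there exist formulas $\varphi_j, \psi_j \in \mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, H^{\ell}_a]$
such that $L(\varphi_j) = \Gamma^{\infty} \setminus A_1^{*} a_1 \cdots A_j^{\infty}$ and $L(\psi_j) = \Gamma^{\infty} \setminus A_j^{*} a_j \cdots A_{k+1}^{\infty}$. This yields the following
formula for the complement of $L$:

$$H^{\ell}_{a_i} \;\vee\; \Bigl( \bigl((\varphi_i\, F^{\ell}_{a_i}\, \top) \vee (\top\, F^{\ell}_{a_i}\, \psi_{i+1})\bigr) \;\wedge \bigwedge_{1 < j \leq i,\ a_i \in A_j} \bigl((\varphi_j\, F^{\ell}_{a_i}\, \top) \vee (\top\, F^{\ell}_{a_i}\, \psi_j)\bigr) \Bigr).$$

This completes the proof. □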

For the inclusion of positive lazy rankers in $\Pi_2 \cap \mathrm{FO}^2$, our proof is based on a characterization
of this fragment in terms of the alphabetic topology over finite and infinite words [2]. A base of
the open subsets of this topology is given by the sets of the form $u A^{\infty}$ for $u \in \Gamma^{*}$ and $A \subseteq \Gamma$. A
language is closed if its complement is open. The closure $\overline{L}$ of a language $L$ is the intersection of all
closed sets containing $L$. A word $\alpha$ belongs to $\overline{L}$ if for every finite prefix $u$ of $\alpha$ there exists $\gamma \in A^{\infty}$
for $A = \mathrm{im}(\alpha)$ such that $u\gamma \in L$. A language $L$ is closed if and only if $\overline{L} \subseteq L$.

Lemma 10 Let $L \subseteq \Gamma^{\infty}$. If $L$ is a lazy ranker language with atomic modality $H^{\ell}_a$ such that all
$Y^{\ell}$-rankers are positive, then $L$ is closed in the alphabetic topology.

Proof: A lazy ranker starting with a future modality is equivalent to its eager counterpart. For
pure eager X-rankers $r$ we have shown in Lemma 6 and Lemma 7 that both $L(r)$ and $\Gamma^{\infty} \setminus L(r)$
are $\Sigma_2$-definable, and hence, $L(r)$ and $\Gamma^{\infty} \setminus L(r)$ are open in the alphabetic topology, i.e., $L(r)$ is
clopen. For every X-ranker $r$ we have $L(r H^{\ell}_a) = L(r) \setminus L(r Y_a)$. Therefore, every lazy $X^{\ell}$-ranker $r$
(possibly with atomic $H^{\ell}_a$-modality) generates a clopen language $L(r)$.

It remains to show that $L(r)$ is closed for every $Y^{\ell}$-ranker $r$. The ranker $r$ may end with $H^{\ell}_a$. We
show that the closure of $L(r)$ in the alphabetic topology is contained in $L(r)$. Suppose $\alpha \in \overline{L(r)}$
and let $A = \mathrm{im}(\alpha)$. Let $s$ be the maximal pure prefix of $r$, i.e., $r \in s\,\{\varepsilon, H^{\ell}_a \mid a \in \Gamma\}$, and let
$k = |s| + 1$. Write $\alpha = u v_1 \cdots v_k \beta$ with $\mathrm{alph}(v_i) = A$ and $\beta \in A^{\infty} \cap A^{\mathrm{im}}$. Since $\alpha \in \overline{L(r)}$,
there exists $\gamma \in A^{\infty}$ such that $u v_1 \cdots v_k \gamma \in L(r)$, i.e., $r$ is defined on the word $\alpha' = u v_1 \cdots v_k \gamma$.
If $s(\alpha') = \infty$, then $s(\alpha) = \infty$, since $\mathrm{im}(\alpha') \subseteq A = \mathrm{im}(\alpha)$. Moreover, $r(\alpha)$ is defined, since
$\mathrm{alph}(\alpha') = \mathrm{alph}(u v_1) = \mathrm{alph}(\alpha)$. Let now $s(\alpha') \neq \infty$. We have to distinguish two cases.

The first case is that all letters occurring in $s$ are from $A$. Then $s(\alpha) = \infty$ (in particular $s(\alpha)$
is defined) and $s(\alpha') > |u v_1|$ by choice of $k$. This shows that $r(\alpha)$ is defined if $r = s$. Now, if
$r = s H^{\ell}_a$, then $a \notin \mathrm{alph}(u v_1) = \mathrm{alph}(\alpha)$, and hence $r(\alpha)$ is defined.

The second case is $s = s_1 Y^{\ell}_b s_2$ such that $b \notin A$ and all letters from $s_1$ are in $A$. Note that we
cannot have the situation $s = s_1 X^{\ell}_b s_2$ with $b \notin A$ and all letters from $s_1$ in $A$, since then $s$
would be undefined on $\alpha'$. Then $s_1 Y^{\ell}_b(\alpha') = s_1 Y^{\ell}_b(\alpha) \leq |u|$. Again, by choice of $k$, it follows that
$s_1 Y^{\ell}_b s_2(\alpha') = s_1 Y^{\ell}_b s_2(\alpha)$. Therefore, even if $r$ ends with $H^{\ell}_a$, we see that $r(\alpha)$ is defined.

In any case, we have $\alpha \in L(r)$. This completes the proof. □

Proof (Theorem 4): "1 ⇒ 2": A language is in $\Pi_2 \cap \mathrm{FO}^2$ if and only if it is the intersection of
complements of unambiguous monomials, see [2]. By Lemma 9, such complements are definable in
$\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, H^{\ell}_a]$. Since $\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, H^{\ell}_a]$ is closed under intersection, the claim follows.

"2 ⇒ 3" is Proposition 2 and "3 ⇒ 4" follows from Lemma 1. "4 ⇒ 5" is trivial.

"5 ⇒ 1": It is easy to see that any lazy ranker language is definable in $\mathrm{TL}[X_a, Y_a]$. Hence, by
Theorem 1, any such language is in $\mathrm{FO}^2$. The class of closed languages is closed under finite union
and intersection. Therefore, by Lemma 10, lazy ranker languages with atomic modality $H^{\ell}_a$ and with
only positive $Y^{\ell}$-rankers are closed in the alphabetic topology. Languages which are closed in the
alphabetic topology and which are $\mathrm{FO}^2$-definable are in $\Pi_2$, see [2]. □

For completeness, we give a counterpart of Theorem 1 using the lazy versions of ITL, TL, and
rankers.

Theorem 5 For $L \subseteq \Gamma^{\infty}$ the following assertions are equivalent:

1. $L$ is definable in $\mathrm{FO}^2$.

2. $L$ is definable in $\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$.

3. $L$ is definable in $\mathrm{ITL}[F^{\ell}_a, L^{\ell}_a]$.

4. $L$ is definable in $\mathrm{TL}[X^{\ell}_a, Y^{\ell}_a]$.

5. $L$ is definable in $\mathrm{TL}^{+}[X^{\ell}_a, Y^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$.

6. $L$ is a positive lazy ranker language with atomic modalities $G^{\ell}_a$ and $H^{\ell}_a$.

7. $L$ is a lazy ranker language.

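Proof: "1 ⇒ 2": Every $\mathrm{FO}^2$-definable language is a finite intersection of languages of the form
$\Gamma^{\infty} \setminus (P \cap A^{\mathrm{im}}) = (\Gamma^{\infty} \setminus P) \cup (\Gamma^{\infty} \setminus A^{\mathrm{im}})$ with an unambiguous monomial $P$ and $A \subseteq \Gamma$, see [2].
Since $\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ is closed under finite unions and finite intersections, it suffices to show
that $\Gamma^{\infty} \setminus P$ and $\Gamma^{\infty} \setminus A^{\mathrm{im}}$ are definable in $\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$. For $\Gamma^{\infty} \setminus P$, this is shown in
Lemma 9 and $\Gamma^{\infty} \setminus A^{\mathrm{im}}$ is defined by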

\/{Tll{TXiT)) V V(TUq)- 

b^A b£A 

"2 ⇒ 3" is trivial.

"3 ⇒ 4": Proposition 2.

"4 ⇒ 5": Using a similar approach as in Lemma 1 we can remove all negations. For this, we
apply De Morgan's laws and the following rules for moving negations to the innermost level:






Note that $X^{\ell}_a \bot = \bot = Y^{\ell}_a \bot$. After incorporating all constants, we end up with an equivalent
formula in $\mathrm{TL}^{+}[X^{\ell}_a, Y^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$.

"5 ⇒ 6" is proved in Lemma 1 (even though not stated explicitly due to lack of space).

"6 ⇒ 7" is trivial.

"7 ⇒ 1": Every lazy X-ranker is equivalent to its eager counterpart and hence, it generates a
$\mathrm{TL}[X_a, Y_a]$-definable language. Let now $r = Y^{\ell}_{a_1} Z^{\ell}_{a_2} \cdots Z^{\ell}_{a_k}$ with each $Z^{\ell}_{a_i}$ being either $X^{\ell}_{a_i}$ or $Y^{\ell}_{a_i}$.
For $A \subseteq \Gamma$ we define the following macro:

$$(A \subseteq \mathrm{im}) \;=\; \bigwedge_{a \in A} (X_a \top \wedge \neg Y_a \top)$$

Now, L{r) is defined by 

Y ({ai,...,ai}Cim A Z^^^^ • • • Z 



I 



tti + l / Ij + i 

Therefore, every lazy ranker language is $\mathrm{TL}[X_a, Y_a]$-definable, and by Theorem 1 it is $\mathrm{FO}^2$-definable. □

8 Conclusion

We have given an eager and a lazy generalization of rankers for infinite words. Together with
the usual rankers over finite words, we obtained combinatorial descriptions of various fragments of
first-order logic FO[<] over finite and infinite words. Without negation the eager variant cannot
express that there are infinitely many occurrences of some letter. This leads to a characterization
of the fragment $\Sigma_2 \cap \mathrm{FO}^2$. Similarly, we cannot say that some letter occurs only finitely often in the
lazy version, and this yields $\Pi_2 \cap \mathrm{FO}^2$. Both eager and lazy rankers are suitable for describing $\mathrm{FO}^2$
and $\Delta_2$. Intermediate steps in all our proofs have been unambiguous ITL and unambiguous TL,
both in an eager and a lazy variant. The following table summarizes some characterizations of the
fragments. For conciseness we introduce some additional terminology. By $\mathrm{Rankers}[C]$ we denote
the class of ranker languages with atomic modalities $C$. If $C$ is empty we simply write $\mathrm{Rankers}$.
The positive fragment is denoted by $\mathrm{Rankers}^{+}[C]$ and $\mathrm{Rankers}_{X}[C]$ are the X-ranker languages. If
we prepend an $\ell$ we mean the respective lazy counterpart.



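| FO-Logic | Interval Logic | Temporal Logic | Rankers | |
|---|---|---|---|---|
| $\mathrm{FO}^2$ | $\mathrm{ITL}[F_a, L_a]$ | $\mathrm{TL}[X_a, Y_a]$ | $\mathrm{Rankers}$ | Thm. 1 |
| | $\mathrm{ITL}[F^{\ell}_a, L^{\ell}_a]$ | $\mathrm{TL}[X^{\ell}_a, Y^{\ell}_a]$ | $\ell$-$\mathrm{Rankers}$ | Thm. 5 |
| | $\mathrm{ITL}^{+}[F_a, L_a, G_a, H_a]$ | $\mathrm{TL}^{+}[X_a, Y_a, G_a, H_a]$ | $\mathrm{Rankers}^{+}[G_a, H_a]$ | Thm. 1 |
| | $\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ | $\mathrm{TL}^{+}[X^{\ell}_a, Y^{\ell}_a, G^{\ell}_a, H^{\ell}_a]$ | $\ell$-$\mathrm{Rankers}^{+}[G^{\ell}_a, H^{\ell}_a]$ | Thm. 5 |
| $\Delta_2$ | $\mathrm{ITL}_{F}[F_a, L_a]$ | $\mathrm{TL}_{X}[X_a, Y_a]$ | $\mathrm{Rankers}_{X}$ | Thm. 3 |
| | $\mathrm{ITL}_{F}[F^{\ell}_a, L^{\ell}_a]$ | $\mathrm{TL}_{X}[X^{\ell}_a, Y^{\ell}_a]$ | $\ell$-$\mathrm{Rankers}_{X}$ | Thm. 3 |
| | $\mathrm{ITL}^{+}_{F}[F_a, L_a, G_a]$ | $\mathrm{TL}^{+}_{X}[X_a, Y_a, G_a]$ | $\mathrm{Rankers}^{+}_{X}[G_a]$ | Thm. 3 |
| | $\mathrm{ITL}^{+}_{F}[F^{\ell}_a, L^{\ell}_a, G^{\ell}_a]$ | $\mathrm{TL}^{+}_{X}[X^{\ell}_a, Y^{\ell}_a, G^{\ell}_a]$ | $\ell$-$\mathrm{Rankers}^{+}_{X}[G^{\ell}_a]$ | Thm. 3 |
| $\Sigma_2 \cap \mathrm{FO}^2$ | $\mathrm{ITL}^{+}[F_a, L_a, G_a]$ | $\mathrm{TL}^{+}[X_a, Y_a, G_a]$ | $\mathrm{Rankers}^{+}[G_a]$ | Thm. 2 |
| $\Pi_2 \cap \mathrm{FO}^2$ | $\mathrm{ITL}^{+}[F^{\ell}_a, L^{\ell}_a, H^{\ell}_a]$ | $\mathrm{TL}^{+}[X^{\ell}_a, Y^{\ell}_a, H^{\ell}_a]$ | $\ell$-$\mathrm{Rankers}^{+}[H^{\ell}_a]$ | Thm. 4 |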



Open Problems 

Rankers over finite words have been introduced for characterizing quantifier alternation within $\mathrm{FO}^2$.
We conjecture that similar results for infinite words can be obtained using our generalizations of
rankers.

Over infinite words, the class of X-ranker languages corresponds to the fragment $\Delta_2$. Over finite
words however, X-ranker languages form a strict subclass of $\Delta_2$ (which for finite words coincides
with $\mathrm{FO}^2$). An algebraic counterpart is still missing. The main problem is that X-ranker languages
do not form a variety of languages.

A well-known theorem by Schützenberger [5] implies that over finite words, arbitrary finite unions
of unambiguous monomials and finite disjoint unions of unambiguous monomials describe the same
class of languages. In the case of infinite words, it is open whether one can require that unambiguous
polynomials are disjoint unions of unambiguous monomials without changing the class of languages.

Acknowledgments. We thank Volker Diekert for a suggestion which led to Theorem 2. We also 
thank the anonymous referees for several useful suggestions which helped to improve the presentation
of the paper.